Borrelia burgdorferi Polynucleotides and Sequences

Field of the Invention

The present invention relates to the field of molecular biology. In particular, it relates to, 
among other things, nucleotide sequences of Borrelia burgdorferi, contigs, ORFs, fragments, 
probes, primers and related polynucleotides thereof, peptides and polypeptides encoded by the
sequences, and uses of the polynucleotides and sequences thereof, such as in fermentation, 
polypeptide production, assays and pharmaceutical development, among others. 

Statement as to Rights to Inventions Made Under 
Federally-Sponsored Research and Development

Part of the work performed during development of this invention utilized U.S. 
Government funds. The U.S. Government may have certain rights in the invention: DE-FC02-95ER61962;
DE-FC02-95ER61963; and NAGW 2554.

Background of the Invention 

Spirochetes are a family of motile, unicellular, spiral-shaped bacteria which share a 
number of structural characteristics. Three genera of the spirochetes are pathogenic in humans:
(a) Treponema, which includes the pathogens that cause syphilis (T. pallidum), yaws (T.
pertenue), and pinta (T. carateum); (b) Borrelia, which includes the pathogens that cause
epidemic and endemic relapsing fever and Lyme disease; and (c) Leptospira, which includes a 
wide variety of small spirochetes that cause mild to serious systemic human illness (Koff, A. B. 
and Rosen, T. J. Am. Acad. Dermatol. 29:519-535 (1993)). 

Lyme borreliosis, more commonly known as Lyme disease, is presently the most common human
disease in the United States transmitted by an arthropod vector. Centers for Disease Control,
Morbid. Mortal. Weekly Rep. 44:590-591 (1995). Further, infection of household pets, such as
dogs, is a considerable problem. The causative agent of this affliction is the spirochete
Borrelia burgdorferi, which is generally transmitted to mammalian hosts by feeding ticks.
Barbour, A. and Fish, D. Science 260:1610-1616 (1993). Once the bacteria pass through the skin
they disseminate and produce a variety of clinical manifestations. Diagnosis of this disease is
often made serologically by the identification of antiborrelial antibodies. Hilton, E. et al.,
J. Clin. Microbiol. 35:774-776 (1997).

While initial symptoms often include a rash at the infection point, Lyme disease is a
multisystemic disorder that may include arthritic, carditic, and neurological manifestations.
While antibiotics are currently used to treat active cases of Lyme disease, B. burgdorferi appears
to be able to persist even after prolonged antibiotic treatment. Further, B. burgdorferi can persist
for years in a mammalian host even in the presence of an active immune response. Straubinger,
R. et al., J. Clin. Microbiol. 35:111-116 (1997); Steere, A., N. Engl. J. Med. 321:586-596 
(1989). 

Animal models have proven useful for studying the progression of Lyme disease,
methods for preventing this disease, and immunological responses to antigenic challenges with
B. burgdorferi proteins. Garcia-Monco, J. et al., J. Infect. Dis. 175:1243-1245 (1997). Using
a canine model, Straubinger, R. et al., Infect. Immun. 65:1273-1285 (1997), demonstrated that
B. burgdorferi migrates into joints and induces up-regulation of interleukin-8 in synovial
membranes. Similarly, B. burgdorferi induction of interleukin-8 production has been
demonstrated in cultured human endothelial cells. Burns, M. et al., Infect. Immun. 65:1217-1222
(1997).

Antigenic heterogeneity has been postulated as a mechanism used by B. burgdorferi for
evasion of host immune responses. Schwan, T. et al., Can. J. Microbiol. 37:450-454 (1991).
In support of this mechanism, antigenic variation has been described with other pathogenic
bacteria. Hagblom, P. et al., Nature 315:156-158 (1985). Further, cassette-type genetic
recombination of genes encoding B. burgdorferi surface proteins has been shown to decrease the
antigenicity of these organisms to antibodies generated against strains which have not undergone
the same recombination. Zhang, J. et al., Cell 89:275-285 (1997).

A number of different types of Lyme disease vaccines have been tested and shown to
induce immunological responses. Whole-cell B. burgdorferi vaccines have been shown to
induce both immunological responses and protective immunity in several animal models.
Reviewed in Wormser, G., Clin. Infect. Dis. 21:1267-1274 (1995). For example, dogs
inoculated with a chemically inactivated whole-cell vaccine primarily develop antibodies to outer
surface membrane proteins of the administered organism. Further, passive immunity has
also been demonstrated in animals using B. burgdorferi specific antisera. Similarly, passive
immunity is conferred in humans by the administration of sera obtained from Lyme disease patients.

While whole-cell Lyme disease vaccines confer protective immunity in animal models,
use of such vaccines presents the risk that responsive antibodies will be generated which
cross-react with human antigens. Reviewed in Wormser, G., supra. This problem is at least partly the
result of the production of B. burgdorferi specific antibodies which cross-react with hepatocytes
and both muscle and nerve cells. B. burgdorferi heat shock proteins and the 41-kd flagellin
subunit are believed to contain the antigens against which these cross-reactive antibodies are
generated.

It is clear that the etiology of diseases mediated or exacerbated by B. burgdorferi involves
the expression of B. burgdorferi genes,
and that characterizing the genes and their patterns of expression would add dramatically to our
understanding of the organism and its host interactions. Knowledge of B. burgdorferi genes
and genomic organization would dramatically improve understanding of disease etiology and lead 
to improved and new ways of preventing, ameliorating, arresting and reversing diseases. 
Moreover, characterized genes and genomic fragments of B. burgdorferi would provide reagents 
for, among other things, detecting, characterizing and controlling B. burgdorferi infections. 
There is a need therefore to characterize the genome of B. burgdorferi and for polynucleotides 
and sequences of this organism. 

SUMMARY OF THE INVENTION 

The present invention is based on the sequencing of fragments of the Borrelia burgdorferi 
genome. The primary nucleotide sequences which were generated are provided in SEQ ID 
NOS:1-155.

The present invention provides the complete nucleotide sequence of the Borrelia 
burgdorferi chromosome and 154 contigs representing the majority of the sequence of the B.
burgdorferi extrachromosomal elements, all of which are listed in tables below and set out in the 
Sequence Listing submitted herewith, and representative fragments thereof, in a form which can 
be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment, the present 
invention is provided as contiguous strings of primary sequence information corresponding to the 
nucleotide sequences depicted in SEQ ID NOS: 1-155. 

The present invention further provides nucleotide sequences which are at least 95%,
96%, 97%, 98%, or 99% identical to the nucleotide sequences of SEQ ID NOS:1-155, ORF
IDs and corresponding ORFs.

The nucleotide sequences of SEQ ID NOS: 1-155, an ORF ID or ORF within, a
representative fragment thereof, or a nucleotide sequence which is at least 95% identical to said
nucleotide sequence may be provided in a variety of mediums to facilitate its use. In one
application of this embodiment, the sequences of the present invention are recorded on computer
readable media. Such media include, but are not limited to: magnetic storage media, such as
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD- 
ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as 
magnetic/optical storage media. 

The present invention further provides systems, particularly computer-based systems 
which contain the sequence information herein described stored in a data storage means. Such 
systems are designed to identify commercially important fragments of the Borrelia burgdorferi 
genome. 

Another embodiment of the present invention is directed to fragments of the Borrelia 
burgdorferi genome having particular structural or functional attributes. Such fragments of the 
Borrelia burgdorferi genome of the present invention include, but are not limited to, fragments 
which encode peptides, hereinafter referred to as open reading frames or ORFs, fragments which 
modulate the expression of an operably linked ORF, hereinafter referred to as expression
modulating fragments or EMFs, and fragments which can be used to diagnose the presence of
Borrelia burgdorferi in a sample, hereinafter referred to as diagnostic fragments or DFs.

WO 98/58943, PCT/US98/12764

Each of the ORF IDs and ORFs in fragments of the Borrelia burgdorferi genome
disclosed in Tables 1-6, and the EMFs found 5' of the initiation codon, can be used in
numerous ways as polynucleotide reagents. For instance, the sequences can be used as
diagnostic probes or amplification primers for detecting or determining the presence of a specific
microbe in a sample, to selectively control gene expression in a host, and in the production of
polypeptides, such as polypeptides encoded by ORFs of the present invention, particularly those
polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more 
fragments of the Borrelia burgdorferi genome of the present invention. The recombinant 
constructs of the present invention comprise vectors, such as a plasmid or viral vector, into 
which a fragment of the Borrelia burgdorferi genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments 
of the Borrelia burgdorferi genome of the present invention. The host cell can be a higher
eukaryotic host cell, such as a mammalian cell, a lower eukaryotic cell, such as a yeast cell, or a 
procaryotic cell such as a bacterial cell. 

The present invention is further directed to isolated polypeptides and proteins encoded by 
ORFs of the present invention. A variety of methods, well known to those of skill in the art, 
routinely may be utilized to obtain any of the polypeptides and proteins of the present invention. 
For instance, polypeptides and proteins of the present invention having relatively short, simple 
amino acid sequences readily can be synthesized using commercially available automated peptide 
synthesizers. Polypeptides and proteins of the present invention also may be purified from 
bacterial cells which naturally produce the protein. Yet another alternative is to purify 
polypeptides and proteins of the present invention from cells which have been altered to express
them. 

The invention further provides methods of obtaining homologs of the fragments of the 
Borrelia burgdorferi genome of the present invention and homologs of the proteins encoded by 
the ORFs of the present invention. Specifically, by using the nucleotide and amino acid 
sequences disclosed herein as a probe or as primers, and techniques such as PCR cloning and 
colony/plaque hybridization, one skilled in the art can obtain homologs. 

The invention further provides antibodies which selectively bind polypeptides and 
proteins of the present invention. Such antibodies include both monoclonal and polyclonal 
antibodies. 

The invention further provides hybridomas which produce the above-described 
antibodies. A hybridoma is an immortalized cell line which is capable of secreting a specific 
monoclonal antibody. 

The present invention further provides methods of identifying test samples derived from 
cells which express one of the ORFs of the present invention, or a homolog thereof. Such 
methods comprise incubating a test sample with one or more of the antibodies of the present
invention, or one or more of the DFs of the present invention, under conditions which allow a 
skilled artisan to determine if the sample contains the ORF or product produced therefrom. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the above-described assays.

Specifically, the invention provides a compartmentalized kit to receive, in close 
confinement, one or more containers which comprises: (a) a first container comprising one of the 
antibodies, or one of the DFs of the present invention; and (b) one or more other containers 
comprising one or more of the following: wash reagents, reagents capable of detecting the presence
of bound antibodies or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides 
methods of obtaining and identifying agents capable of binding to a polypeptide or protein 
encoded by one of the ORFs of the present invention. Specifically, such agents include, as 
further described below, antibodies, peptides, carbohydrates, pharmaceutical agents and the like. 
Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded by one
of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Borrelia burgdorferi will be of great value to all 
laboratories working with this organism and for a variety of commercial purposes. Many 
fragments of the Borrelia burgdorferi genome will be immediately identified by similarity 
searches against GenBank or protein databases and will be of immediate value to Borrelia
burgdorferi researchers and of immediate commercial value for the production of proteins or to
control gene expression. 

The methodology and technology for elucidating extensive genomic sequences of
bacterial and other genomes have enhanced, and will greatly enhance, the ability to analyze and
understand chromosomal organization. In particular, sequenced contigs and genomes will provide the
models for developing tools for the analysis of chromosome structure and function, including the
ability to identify genes within large segments of genomic DNA, the structure, position, and
spacing of regulatory elements, the identification of genes with potential industrial applications,
and the ability to do comparative genomics and molecular phylogeny.

DESCRIPTION OF THE FIGURES 

FIGURE 1 is a block diagram of a computer system (102) that can be used to 
implement computer-based systems of the present invention.



FIGURE 2 is a schematic diagram depicting the data flow and computer programs used
to collect, assemble, edit and annotate the contigs of the Borrelia burgdorferi genome of the
present invention. Both Macintosh and Unix platforms are used to handle the AB 373 and 377
sequence data files, largely as described in Kerlavage et al., Proceedings of the Twenty-Sixth
Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society
Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic
vector sequence removal and end-trimming of sequence files. The program Loadis runs on a
Macintosh platform and parses the feature data extracted from the sequence files by Factura to the
Unix-based Borrelia burgdorferi relational database. Assembly of contigs (and whole genome
sequences) is accomplished by retrieving a specific set of sequence files and their associated
features using Extrseq, a Unix utility for retrieving sequences from an SQL database. The
resulting sequence file is processed to trim portions of the sequences with a high rate of ambiguous
nucleotides. The sequence files were assembled using TIGR Assembler, an assembly engine
designed at The Institute for Genomic Research (TIGR) for rapid and accurate assembly of
thousands of sequence fragments. The collection of contigs generated by the assembly step is
loaded into the database with the lassie program. Identification of open reading frames (ORFs) is
accomplished by processing contigs with zorf. The ORFs are searched against B. burgdorferi
sequences from GenBank and against all protein sequences using the BLASTN and BLASTP
programs, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF
determination and similarity searching steps were loaded into the database. As described below,
some results of the determination and the searches are set out in Tables 1-6.
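
By way of illustration only, the ORF identification step can be sketched in Python as follows. This is not the zorf program, whose internals are not described here. The function names, the rule that only ATG starts an ORF, the reporting of complete (stop-terminated) ORFs only, and the minimum length threshold are all assumptions made for the example.

```python
# Hypothetical sketch of ORF identification on a contig; NOT the zorf program.
# Assumptions: ORFs start at ATG, end at the first in-frame stop codon,
# and must contain at least `min_codons` codons before the stop.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames.

    Returns (strand, start, end) tuples. Coordinates are 0-based with an
    exclusive end, measured on the strand that was scanned; the stop codon
    is included in the reported span.
    """
    orfs = []
    strands = (("+", seq.upper()), ("-", reverse_complement(seq)))
    for strand, s in strands:
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


contig = "CCATGAAATTTGGGTAACC"
print(find_orfs(contig))  # one forward-strand ORF: ATGAAATTTGGGTAA
```

In this toy contig the only complete ORF is ATGAAATTTGGGTAA on the forward strand. A production tool would also handle alternative start codons and ambiguous bases, which this sketch deliberately ignores.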

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS 

The present invention is based on the sequencing of fragments of the Borrelia burgdorferi
genome and analysis of the sequences. The primary nucleotide sequences generated by
sequencing the fragments are provided in SEQ ID NOS: 1-155. (As used herein, the "primary
sequence" refers to the nucleotide sequence represented by the IUPAC nomenclature system.)

In addition, the present invention provides the nucleotide sequences of SEQ ID NOS: 1-155,
or representative fragments thereof, in a form which can be readily used, analyzed, and
interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID
NOS: 1-155" refers to any portion of SEQ ID NOS: 1-155 which is not presently
represented within a publicly available database. Preferred representative fragments of the
present invention are Borrelia burgdorferi open reading frames (ORFs) represented by ORF
IDs, expression modulating fragments (EMFs) and diagnostic fragments (DFs) which can be used
to diagnose the presence of Borrelia burgdorferi in a sample. A non-limiting identification of
preferred representative portions is provided in Tables 1-6 as ORF IDs. As discussed in detail
below, the information provided in SEQ ID NOS: 1-155 and in Tables 1-6 together with routine
cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and
sequence all "representative fragments" of interest, including ORFs encoding a large variety of
Borrelia burgdorferi proteins.

The present invention is further directed to nucleic acid molecules encoding portions or
fragments of the nucleotide sequences described herein. Fragments include portions of the
nucleotide sequences of Tables 1-6 (ORF IDs) and SEQ ID NOS: 1-155, at least 10 contiguous
nucleotides in length selected from any two integers, one of which represents a 5' nucleotide
position and a second of which represents a 3' nucleotide position, where the first nucleotide
for each nucleotide sequence in SEQ ID NOS: 1-155 is position 1 (therefore, the sequence
positions for each ORF ID are determined by the numbering of the SEQ ID comprising the ORF
ID). That is, every combination of a 5' and 3' nucleotide position that a fragment at least 10
contiguous nucleotides in length could occupy is included in the invention. "At least" means a
fragment may be 10 contiguous nucleotide bases in length or any integer between 10 and the
length of an entire nucleotide sequence of SEQ ID NOS: 1-155 minus 1. Therefore, included in
the invention are contiguous fragments specified by any 5' and 3' nucleotide base positions of the
nucleotide sequences of SEQ ID NOS: 1-155 wherein the length of the contiguous fragment is any
integer between 10 and the length of an entire nucleotide sequence minus 1.
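
The position-based definition above can be made concrete with a short Python sketch. The function names are hypothetical and the 12-nucleotide string is a toy stand-in for a sequence of the Sequence Listing. Positions are 1-based and inclusive, as in the text, and a fragment must be between 10 nucleotides and the full sequence length minus 1.

```python
# Hypothetical helpers illustrating fragments specified by 5' and 3' positions.
# Positions are 1-based and inclusive, with the first nucleotide at position 1.


def fragment(seq, five_prime, three_prime, min_len=10):
    """Return the fragment spanning positions five_prime..three_prime."""
    if not 1 <= five_prime <= three_prime <= len(seq):
        raise ValueError("need 1 <= 5' position <= 3' position <= sequence length")
    length = three_prime - five_prime + 1
    if not min_len <= length <= len(seq) - 1:
        raise ValueError("fragment length must be between min_len and len(seq) - 1")
    return seq[five_prime - 1:three_prime]


def count_fragments(seq_len, min_len=10):
    """Count distinct (5', 3') pairs giving fragments of min_len..seq_len-1 bases."""
    # A fragment of length k can start at any of seq_len - k + 1 positions.
    return sum(seq_len - k + 1 for k in range(min_len, seq_len))


toy = "ACGTACGTACGT"                 # 12 nucleotides, illustrative only
print(fragment(toy, 2, 11))          # the 10-mer at positions 2 through 11
print(count_fragments(len(toy)))     # three 10-mers plus two 11-mers
```

For a sequence of length n, the count is the sum over fragment lengths k from 10 to n - 1 of (n - k + 1), which grows roughly quadratically with n.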

Further, the invention includes polynucleotides comprising fragments specified by size,
in nucleotides, rather than by nucleotide positions. The invention includes any fragment size, in
contiguous nucleotides, selected from integers between 10 and the length of an entire ORF ID or
SEQ ID NO, minus 1. Preferred sizes of contiguous nucleotide fragments include 20
nucleotides, 30 nucleotides, 40 nucleotides, and 50 nucleotides. Other preferred sizes of contiguous
nucleotide fragments, which may be useful as diagnostic probes and primers, include fragments
50-300 nucleotides in length which include, as discussed above, fragment sizes representing each
integer between 50 and 300. Larger fragments are also useful according to the present invention,
corresponding to most, if not all, of the nucleotide sequences shown in Tables 1-6 (ORF IDs)
and SEQ ID NOS: 1-155. The preferred sizes are, of course, meant to exemplify, not limit, the
present invention, as all size fragments, representing any integer between 10 and the length of an
entire nucleotide sequence minus 1, of each ORF ID and SEQ ID NO, are included in the
invention.

The present invention also provides for the exclusion of any fragment, specified by 5'
and 3' base positions or by size in nucleotide bases as described above, for any ORF ID or SEQ
ID NOS: 1-155. Any number of fragments of nucleotide sequences in ORF IDs or SEQ ID
NOS: 1-155, specified by 5' and 3' base positions or by size in nucleotides, as described above,
may be excluded from the present invention.

While the presently disclosed sequences of SEQ ID NOS: 1-155 are highly accurate,
sequencing techniques are not perfect and, in relatively rare instances, further investigation of a
fragment or sequence of the invention may reveal a nucleotide sequence error present in a
nucleotide sequence disclosed in SEQ ID NOS: 1-155. However, once the present invention is
made available (i.e., once the information in SEQ ID NOS: 1-155 and Tables 1-6 has been made
available), resolving a rare sequencing error in SEQ ID NOS: 1-155 will be well within the skill
of the art. The present disclosure makes available sufficient sequence information to allow any of
the described contigs or portions thereof to be obtained readily by straightforward application of
routine techniques. Further sequencing of such polynucleotides may proceed in like manner using
manual and automated sequencing methods which are employed ubiquitously in the art. Nucleotide
sequence editing software is publicly available. For example, Applied Biosystems' (AB)
AutoAssembler can be used as an aid during visual inspection of nucleotide sequences. By
employing such routine techniques potential errors readily may be identified and the correct
sequence then may be ascertained by targeting further sequencing effort, also of a routine nature,
to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS: 1-155 were corrected, the
resulting nucleotide sequences would still be at least 95% identical, nearly all would be at least
99% identical, and the great majority would be at least 99.9% identical to the nucleotide
sequences of SEQ ID NOS: 1-155.

As discussed elsewhere herein, polynucleotides of the present invention readily may be
obtained by routine application of well known and standard procedures for cloning and
sequencing DNA. Detailed methods for obtaining libraries and for sequencing are provided
below, for instance. A wide variety of Borrelia burgdorferi strains that can be used to prepare B.
burgdorferi genomic DNA for cloning and for obtaining polynucleotides of the present invention
are available to the public from recognized depository institutions, such as the American Type
Culture Collection (ATCC). While the present invention is enabled by the sequences and other
information herein disclosed, the B. burgdorferi strain that provided the DNA of the present
Sequence Listing has been deposited with the ATCC, 10801 University Blvd., Manassas, VA
20110-2209, as Deposit No. 202012, on 8 August 1997. The ATCC Deposit is provided merely
as a convenience to those of skill in the art. Reference to the deposit is not a waiver of any rights
of the inventors or their assignees in the present subject matter.

The nucleotide sequences of the genomes from different strains of Borrelia burgdorferi 
differ somewhat. However, the nucleotide sequences of the genomes of all Borrelia burgdorferi 
strains will be at least 95% identical, in corresponding part, to the nucleotide sequences provided 
in SEQ ID NOS: 1-155 and the ORF IDs within. Nearly all will be at least 99% identical and the
great majority will be 99.9% identical.

The present application is further directed to nucleic acid molecules at least 90%, 95%,
96%, 97%, 98% or 99% identical to a nucleic acid sequence shown in SEQ ID NOS: 1-155 and
the ORF IDs within. The above nucleic acid sequences are included irrespective of whether they
encode a polypeptide having B. burgdorferi activity. This is because even where a particular
nucleic acid molecule does not encode a polypeptide having B. burgdorferi activity, one of skill
in the art would still know how to use the nucleic acid molecule, for instance, as a hybridization
probe. Uses of the nucleic acid molecules of the present invention that do not encode a
polypeptide having B. burgdorferi activity include, inter alia, isolating a B. burgdorferi gene or
allelic variants thereof from a DNA library, and detecting B. burgdorferi mRNA expression in
biological or environmental samples suspected of containing B. burgdorferi, by Northern blot,
PCR, or similar analysis.

Preferred are nucleic acid molecules having sequences at least 90%, 95%, 96%, 97%,
98% or 99% identical to the nucleic acid sequence shown in SEQ ID NOS: 1-155, the ORF IDs,
and the ORF within each ORF ID, which do, in fact, encode a polypeptide having B. burgdorferi
protein activity. By "a polypeptide having B. burgdorferi activity" is intended polypeptides
exhibiting activity similar, but not necessarily identical, to an activity of the B. burgdorferi
protein of the invention, as measured in a particular biological assay suitable for measuring
activity of the specified protein.

Due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately
recognize that a large number of the nucleic acid molecules having a sequence at least 90%, 95%,
96%, 97%, 98%, or 99% identical to the nucleic acid sequences shown in SEQ ID NOS: 1-155,
the ORF IDs, and the ORF within each ORF ID, will encode a polypeptide having B. burgdorferi
protein activity. In fact, since degenerate variants of these nucleotide sequences all encode the
same polypeptide, this will be clear to the skilled artisan even without performing the above
described comparison assay. It will be further recognized in the art that, for such nucleic acid
molecules that are not degenerate variants, a reasonable number will also encode a polypeptide
having B. burgdorferi protein activity. This is because the skilled artisan is fully aware of amino
acid substitutions that are either less likely or not likely to significantly affect protein function
(e.g., replacing one aliphatic amino acid with a second aliphatic amino acid), as further described
below.
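
The point about degenerate variants can be illustrated with a brief Python sketch. The two 15-nucleotide sequences are invented for the example and are not taken from the Sequence Listing. The codon table is the standard genetic code, which is assumed here to be adequate for illustration, and the function names are hypothetical.

```python
# Illustrative sketch: degenerate (synonymous) variants encode the same peptide.
# The sequences below are made up for the example; the codon table is the
# standard genetic code written in the conventional TCAG order.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def ungapped_identity(a, b):
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


reference = "ATGGCTCTGAAAGGT"
variant = "ATGGCCTTAAAGGGC"   # differs only at synonymous positions

print(translate(reference), translate(variant))
print(round(ungapped_identity(reference, variant), 1))
```

Both sequences translate to the same five-residue peptide even though they differ at 5 of 15 nucleotide positions. This is the sense in which a degenerate variant necessarily encodes the same polypeptide whatever its nucleotide-level identity.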

The biological activity or function of the polypeptides of the present invention is
expected to be similar or identical to that of polypeptides from other bacteria that share a high
degree of structural identity/similarity. Tables 1, 2, 4, and 5 list accession numbers and
descriptions for the closest matching sequences of polypeptides available through GenBank. It is
therefore expected that the biological activity or function of the polypeptides of the present
invention will be similar or identical to those polypeptides from other bacterial genera, species,
or strains listed in Tables 1, 2, 4, and 5.

By a polynucleotide having a nucleotide sequence at least, for example, 95% "identical"
to a reference nucleotide sequence of the present invention, it is intended that the nucleotide
sequence of the polynucleotide is identical to the reference sequence except that the
polynucleotide sequence may include up to five point mutations per 100 nucleotides of the
reference nucleotide sequence encoding the B. burgdorferi polypeptide. In other words, to
obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference
nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted,
inserted, or substituted with another nucleotide. The query sequence may be an entire sequence
shown in SEQ ID NOS: 1-155, an ORF ID, or the ORF within each ORF ID, or any fragment
specified as described herein.
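
As a rough illustration of this definition, the following Python sketch counts substitutions, deletions and insertions in a pre-computed gapped alignment and expresses them relative to the reference length. The function name and the use of '-' as the gap character are assumptions for the example. It is not the FASTDB calculation described below.

```python
# Hypothetical sketch of the "at least N% identical" definition.
# Inputs are two rows of an existing gapped alignment ('-' marks a gap);
# every substituted, deleted or inserted position counts as one change.


def percent_identity(aligned_reference, aligned_other):
    """Percent identity relative to the ungapped reference length."""
    if len(aligned_reference) != len(aligned_other):
        raise ValueError("alignment rows must have equal length")
    reference_length = sum(1 for base in aligned_reference if base != "-")
    changes = sum(1 for r, o in zip(aligned_reference, aligned_other) if r != o)
    return 100.0 * (reference_length - changes) / reference_length


reference = "ACGT" * 25                      # 100-nucleotide toy reference
mutated = list(reference)
for position in (0, 20, 40, 60, 80):         # five point substitutions
    mutated[position] = "C"
mutated = "".join(mutated)

print(percent_identity(reference, mutated))  # five changes per 100 nucleotides
```

Five changes in a 100-nucleotide reference leave the sequence exactly 95% identical, the boundary case named in the text. A sixth change would take it below that threshold.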

As a practical matter, whether any particular nucleic acid molecule or polypeptide is at
least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present
invention can be determined conventionally using known computer programs. A preferred
method for determining the best overall match between a query sequence (a sequence of the
present invention) and a subject sequence, also referred to as a global sequence alignment, uses
the FASTDB computer program based on the algorithm of Brutlag et al. See
Brutlag et al. (1990) Comp. App. Biosci. 6:237-245. In a sequence alignment the query and
subject sequences are both DNA sequences. An RNA sequence can be compared by first
converting U's to T's. The result of said global sequence alignment is expressed as percent identity.
Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent
identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30,
Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05,
Window Size=500 or the length of the subject nucleotide sequence, whichever is shorter.



If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, 
not because of internal deletions, a manual correction must be made to the results. This is 
because the FASTDB program does not account for 5' and 3' truncations of the subject sequence 
when calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to 
the query sequence, the percent identity is corrected by calculating the number of bases of the 
query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a 
percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is 
determined by results of the FASTDB sequence alignment. This percentage is then subtracted 
from the percent identity, calculated by the above FASTDB program using the specified 
parameters, to arrive at a final percent identity score. This corrected score is what is used for the 
purposes of the present invention. Only nucleotides outside the 5' and 3' nucleotides of the 
subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with 
the query sequence, are calculated for the purposes of manually adjusting the percent identity 
score. 

For example, a 90 nucleotide subject sequence is aligned to a 100 nucleotide query sequence to 
determine percent identity. The deletions occur at the 5' end of the subject sequence and 
therefore, the FASTDB alignment does not show a match/alignment of the first 10 nucleotides 
at the 5' end. The 10 unpaired nucleotides represent 10% of the sequence (number of nucleotides at 
the 5' and 3' ends not matched/total number of nucleotides in the query sequence) so 10% is 
subtracted from the percent identity score calculated by the FASTDB program. If the remaining 
90 nucleotides were perfectly matched the final percent identity would be 90%. In another 
example, a 90 nucleotide subject sequence is compared with a 100 nucleotide query sequence. 
This time the deletions are internal deletions so that there are no nucleotides on the 5' or 3' of the 
subject sequence which are not matched/aligned with the query. In this case the percent identity 
calculated by FASTDB is not manually corrected. Once again, only nucleotides 5' and 3' of the 
subject sequence which are not matched/aligned with the query sequence are manually corrected 
for. No other manual corrections are to be made for the purposes of the present invention. 
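
By way of illustration only, the manual correction described above can be sketched in Python. The function and argument names below are hypothetical and are not part of the FASTDB program; the sketch assumes that the uncorrected percent identity and the counts of unmatched 5' and 3' query bases have already been read from a FASTDB alignment.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_5prime, unmatched_3prime):
    """Subtract the share of query bases lying 5' and 3' of the subject.

    fastdb_identity  -- percent identity reported by the alignment program
    query_length     -- total number of bases in the query sequence
    unmatched_5prime -- query bases 5' of the subject that are not aligned
    unmatched_3prime -- query bases 3' of the subject that are not aligned
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    penalty = 100.0 * overhang / query_length
    return fastdb_identity - penalty


# First example in the text: 10 unmatched 5' bases in a 100 base query,
# with the remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, 10, 0))

# Second example in the text: internal deletions only, so no bases are
# counted at either end and the reported score is left unchanged.
print(corrected_percent_identity(90.0, 100, 0, 0))
```

Internal deletions are deliberately not passed to the function, since only terminal unmatched bases enter the correction.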

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS: 1-155, including ORF IDs and 

corresponding ORFs, a representative fragment thereof, or a nucleotide sequence at least 95%, 
preferably at least 96%, 97%, 98% or 99%, and most preferably at least 99.9% identical to said 
nucleotide sequences may be "provided" in a variety of mediums to facilitate use thereof. As 
used herein, provided refers to a manufacture, other than an isolated nucleic acid molecule, 

which contains a nucleotide sequence of the present invention, a representative fragment thereof, 
or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% 
identical to a polynucleotide of the present invention. Such a manufacture provides a large 
portion of the Borrelia burgdorferi genome and parts thereof (e.g., a Borrelia burgdorferi open 
reading frame (ORF)) in a form which allows a skilled artisan to examine the manufacture using 

means not directly applicable to examining the Borrelia burgdorferi genome or a subset thereof as 
it exists in nature or in purified form. 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but are 

not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable mediums can be 
used to create a manufacture comprising computer readable medium having recorded thereon a 

nucleotide sequence of the present invention. Likewise, it will be clear to those of skill how 

additional computer readable media that may be developed also can be used to create analogous 
manufactures having recorded thereon a nucleotide sequence of the present invention. 

As used herein, "recorded" refers to a process for storing information on computer 
readable medium. A skilled artisan can readily adopt any of the presently known methods for 
recording information on computer readable medium to generate manufactures comprising the 
nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 

to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data-processor structuring 
formats (e.g., text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 

Computer software is publicly available which allows a skilled artisan to access sequence 
information provided in a computer readable medium. Thus, providing in computer readable 
form the nucleotide sequences of the present invention (e.g., SEQ ID NOS: 1-155), a 
representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 96%, 
97%, 98%, 99% and most preferably at least 99.9% identical to a sequence of the present 
invention (e.g., SEQ ID NOS: 1-155) enables the skilled artisan routinely to access the provided 
sequence information for a wide variety of purposes. 

The examples which follow demonstrate how software which implements the BLAST 
(Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 
17:203-207 (1993)) search algorithms on a Sybase system was used to identify open reading 
frames (ORFs) within the Borrelia burgdorferi genome which contain homology to ORFs or 
proteins from both Borrelia burgdorferi and from other organisms. Among the ORFs discussed 
herein are protein encoding fragments of the Borrelia burgdorferi genome useful in producing 
commercially important proteins, such as enzymes used in fermentation reactions and in the 
production of commercially useful metabolites. 

The present invention further provides systems, particularly computer-based systems, 
which contain the sequence information described herein. Such systems are designed to identify, 
among other things, commercially important fragments of the Borrelia burgdorferi genome. 

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the present 
invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. 

As stated above, the computer-based systems of the present invention comprise a data 
storage means having stored therein a nucleotide sequence of the present invention and the 
necessary hardware means and software means for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide 
sequence information of the present invention, or a memory access means which can access 
manufactures having recorded thereon the nucleotide sequence information of the present 
invention. 

As used herein, "search means" refers to one or more programs which are implemented 

on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of the present genomic sequences which match a particular target sequence 
or target motif. A variety of known algorithms are disclosed publicly and a variety of 
commercially available software for conducting search means is available and can be used in the 
computer-based systems of the present invention. Examples of such software include, but are 
not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can 
readily recognize that any one of the available algorithms or implementing software packages for 
conducting homology searches can be adapted for use in the present computer-based systems. 

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or 
more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the 
longer a target sequence is, the less likely a target sequence will be present as a random 
occurrence in the database. The most preferred sequence length of a target sequence is from 

about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well 
recognized that searches for commercially important fragments, such as sequence fragments 
involved in gene expression and protein processing, may be of shorter length. 
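
The relationship between target length and chance occurrence can be illustrated with a short Python sketch. It assumes, purely for illustration, that residues in the database are independent and equally frequent, which real genomes are not; the function name and the example lengths are hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of a target in a database.

    Assumes independent, equally frequent residues (alphabet_size=4 for
    nucleotides, 20 for amino acids). This is a simplifying assumption.
    """
    if target_length <= 0 or database_length < target_length:
        return 0.0
    positions = database_length - target_length + 1
    return positions * (1.0 / alphabet_size) ** target_length


# Longer targets are expected to occur by chance far less often.
for length in (6, 12, 30):
    print(length, expected_random_hits(length, 1_000_000))
```

Under these assumptions a six nucleotide target is expected to recur many times in a megabase of sequence, whereas a 30 nucleotide target is not, consistent with the preference for longer targets stated above.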

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 

three-dimensional configuration which is formed upon the folding of the target motif. There are a 
variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 

A variety of structural formats for the input and output means can be used to input and 

output the information in the computer-based systems of the present invention. A preferred 
format for an output means ranks fragments of the Borrelia burgdorferi genomic sequences 
possessing varying degrees of homology to the target sequence or target motif. Such 
presentation provides a skilled artisan with a ranking of sequences which contain various 

amounts of the target sequence or target motif and identifies the degree of homology contained in 
the identified fragment. 
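
A ranked output of the kind described above can be sketched as follows. This is a deliberately simple ungapped sliding-window comparison written for illustration; it is not the BLAST or BLAZE algorithm, and the function name and the short test sequences are hypothetical.

```python
def rank_fragments(genome, target, top_n=3):
    """Rank genome windows by percent identity to the target sequence.

    Returns a list of (percent_identity, start_position, fragment) tuples,
    with 1-based start positions, best matches first.
    """
    if not target:
        raise ValueError("target must not be empty")
    width = len(target)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        matches = sum(1 for a, b in zip(window, target) if a == b)
        hits.append((100.0 * matches / width, start + 1, window))
    # Highest identity first; ties are broken by position in the genome.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top_n]


for identity, position, fragment in rank_fragments("AAACGTACGTTT", "ACGT"):
    print(f"{identity:5.1f}%  position {position}  {fragment}")
```

Each output row pairs a fragment with its degree of identity to the target, which is the ranking the preferred output format calls for.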

A variety of comparing means can be used to compare a target sequence or target motif 
with the data storage means to identify sequence fragments of the Borrelia burgdorferi genome. 
In the present examples, software which implements the BLAST and BLAZE 
algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410 (1990), is used to identify 
open reading frames within the Borrelia burgdorferi genome. A skilled artisan can readily 
recognize that any one of the publicly available homology search programs can be used as the 
search means for the computer-based systems of the present invention. Of course, suitable 
proprietary systems that may be known to those of skill also may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of embodiments of 
this aspect of the present invention. The computer system 102 includes a processor 106 connected to 
a bus 104. Also connected to the bus 104 are a main memory 108 (preferably implemented as 
random access memory, RAM) and a variety of secondary storage devices 110, such as a hard 
drive 112 and a removable medium storage device 114. The removable medium storage device 
114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, 
etc. A removable storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, 
etc.) containing control logic and/or data recorded therein may be inserted into the removable 
medium storage device 114. The computer system 102 includes appropriate software for reading 
the control logic and/or the data from the removable storage medium 116, once it is 
inserted into the removable medium storage device 114. 

A nucleotide sequence of the present invention may be stored in a well known manner in 
the main memory 108, any of the secondary storage devices 110, and/or a removable storage 
medium 116. During execution, software for accessing and processing the genomic sequence 
(such as search tools, comparing tools, etc.) resides in main memory 108, in accordance with the 
requirements and operating parameters of the operating system, the hardware system and the 
software program or programs. 

BIOCHEMICAL EMBODIMENTS 

Other embodiments of the present invention are directed to isolated fragments of the 

Borrelia burgdorferi genome. The fragments of the Borrelia burgdorferi genome of the present 
invention include, but are not limited to fragments which encode peptides, hereinafter open 
reading frames (ORFs), fragments which modulate the expression of an operably linked ORF, 
hereinafter expression modulating fragments (EMFs) and fragments which can be used to 

diagnose the presence of Borrelia burgdorferi in a sample, hereinafter diagnostic fragments 
(DFs). 

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the 
Borrelia burgdorferi genome" refers to a nucleic acid molecule possessing a specific nucleotide 
sequence which has been subjected to purification means to reduce, from the composition, the 

number of compounds which are normally associated with the composition. Particularly, the 

term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS: 1-155, to 
representative fragments thereof as described above including ORF IDs and ORFs, to 
polynucleotides at least 95%, preferably at least 96%, 97%, 98%, or 99% and especially 
preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the 

present invention. These include, but are not limited to methods which separate constituents of a 
solution based on charge, solubility, or size. 

In one embodiment, Borrelia burgdorferi DNA can be enzymatically sheared to produce 
fragments of 15-20 kb in length. These fragments can then be used to generate a Borrelia 

burgdorferi library by inserting them into lambda clones as described in the Examples below. 
Primers flanking, for example, an ORF, such as those enumerated in Tables 1-6 can then be 
generated using nucleotide sequence information provided in SEQ ID NOS: 1-155. Well known 
and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA 
library or Borrelia burgdorferi genomic DNA. Thus, given the availability of SEQ ID NOS: 
1-155, the information in Tables 1-6, and the information that may be obtained readily by analysis 
of the sequences of SEQ ID NOS: 1-155 using methods set out above, those of skill will be 
enabled by the present disclosure to isolate any ORF-containing or other nucleic acid fragment of 
the present invention. 

The isolated nucleic acid molecules of the present invention include, but are not limited to 

single stranded and double stranded DNA, and single stranded RNA. For purposes of 
numbering and reference to polynucleotide and polypeptide sequences the entire sequence of each 
sequence of SEQ ID NOS: 1-155 is included with the first nucleotide being position 1. 
Therefore, for reference purposes the numbering used in the present invention is that provided in 

the sequence listing for SEQ ID NOS: 1-155. 

As used herein, an open reading frame (ORF), means a series of nucleotide triplets 
coding for amino acid residues without any termination codons and is a sequence translatable into 
protein. Further, unless specified, the term "ORF" for each ORF ID is defined by the termination 
codon at the 3' end and the 5' most methionine codon, at the 5' end, in frame with said 3' 

termination codon. Unless specified, the term "ORF" also refers to a particular polypeptide 

sequence defined by the ORF polynucleotide sequence, wherein the N-terminus is defined by the 
5' most methionine codon in frame with the termination codon at the 3' end of the ORF ID and 
the C-terminus is defined by the last codon before the said 3' termination codon. As used herein, 
an ORF ID represents a sequence without any internal termination codons flanked by termination 

codons. 
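
The two definitions above (an ORF ID bounded by termination codons, and an ORF beginning at the 5' most in-frame methionine codon) can be illustrated with the following Python sketch. It scans only the given strand in its three reading frames and assumes the stop codons TAA, TAG and TGA and the start codon ATG; the function name and the test sequence are hypothetical, and the sketch is not the GeneMark method referred to below.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(sequence):
    """Locate ORF IDs and ORFs on one strand, in all three frames.

    Returns (orf_id_start, orf_start, stop_start) tuples of 1-based
    positions: the first base after the upstream stop codon, the first
    base of the 5' most in-frame ATG, and the first base of the 3' stop.
    Stretches lacking an upstream stop or an ATG are skipped.
    """
    sequence = sequence.upper()
    results = []
    for frame in range(3):
        previous_stop = None  # index of the last stop codon in this frame
        first_atg = None      # index of the 5' most ATG since that stop
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if codon in STOP_CODONS:
                if previous_stop is not None and first_atg is not None:
                    results.append((previous_stop + 4, first_atg + 1, i + 1))
                previous_stop = i
                first_atg = None
            elif codon == START_CODON and first_atg is None:
                first_atg = i
    return results


# TAA GCA ATG AAA ATG CCC TGA: the ORF starts at the first ATG, not the second.
print(find_orfs("TAAGCAATGAAAATGCCCTGA"))
```

In the example the ORF ID begins immediately after the upstream TAA, while the ORF begins at the first of the two in-frame ATG codons.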

Tables 1-6 list ORF IDs in the Borrelia burgdorferi genomic contigs of the present 
invention that were identified as putative coding regions by the GeneMark software using 
organism-specific second-order Markov probability transition matrices. It will be appreciated that 
other criteria can be used, in accordance with well known analytical methods, such as those 

discussed herein, to generate more inclusive, more restrictive, or more selective lists. 

The B. burgdorferi genome consists of one large linear chromosome containing 
approximately two thirds of its genetic material and multiple extrachromosomal elements 
(approximately 15) containing the remaining one third of its genetic material. SEQ ID NO: 1 
(Contig ID 1) is the complete sequence of the large linear B. burgdorferi chromosome. SEQ ID 
NOS: 2-155 (Contig ID 2-155 respectively) are fragments (contigs) of the extrachromosomal 
elements. Tables 1-3 below relate only to SEQ ID NO: 1. Tables 4-6 relate to the 
extrachromosomal elements (SEQ ID NOS: 2-155). 

Table 1 sets out ORF IDs in the Borrelia burgdorferi chromosome of the present 
invention that over a continuous region of at least 50 bases are 95% or more identical (by 
BLAST analysis using default parameters) to a nucleotide sequence available through GenBank 
in July, 1997. 

Table 2 sets out ORF IDs in the Borrelia burgdorferi chromosome of the present 
invention that are not in Table 1 and match, with a BLASTP probability score of 0.01 or less, a 
polypeptide sequence available through GenBank in July, 1997. 
Table 3 sets out ORF IDs in the Borrelia burgdorferi chromosome of the present 
invention that do not match significantly, by BLASTP analysis, a polypeptide sequence available 
through GenBank in July, 1997. 

Table 4 sets out ORF IDs in the Borrelia burgdorferi extrachromosomal element contigs 
of the present invention that over a continuous region of at least 50 bases are 95% or more 
identical (by BLAST analysis) to a nucleotide sequence available through GenBank in July, 
1997. 

Table 5 sets out ORF IDs in the Borrelia burgdorferi extrachromosomal element contigs 
of the present invention that are not in Table 4 and match, with a BLASTP probability score of 

0.01 or less, a polypeptide sequence available through GenBank in July, 1997. 

Table 6 sets out ORF IDs in the Borrelia burgdorferi extrachromosomal element contigs 
of the present invention that do not match significantly, by BLASTP analysis, a polypeptide 
sequence available through GenBank in July, 1997. 

In each table, the first and second columns identify the ORF ID by, respectively, contig 

number and ORF ID number within the contig; the third column indicates the first nucleotide of 
the ORF ID, counting from the 5' end of the contig strand; and the fourth column indicates the 
last nucleotide of the ORF ID, counting from the 5' end of the contig strand. 

In Tables 1, 2, 4 and 5, column five lists the reference for the closest matching 
sequence available through GenBank. These reference numbers are the database accession 
numbers commonly used by those of skill in the art, who will be familiar with their 

denominators. Descriptions of the nomenclature are available from the National Center for 
Biotechnology Information. Column seven provides the BLAST identity score from the 
comparison of the ORF ID and the homologous gene; and column nine indicates the length in 
nucleotides of the highest scoring segment pair identified by the BLAST identity analysis. 

The concepts of percent identity and percent similarity of two polypeptide sequences are 

well understood in the art. For example, two polypeptides 10 amino acids in length which differ 
at three amino acid positions (e.g., at positions 1, 3 and 5) are said to have a percent identity of 
70%. However, the same two polypeptides would be deemed to have a percent similarity of 
80% if, for example at position 5, the amino acid moieties, although not identical, were 
"similar" (i.e., possessed similar biochemical characteristics). As is known in the art, 
substitution of one amino acid for a "similar" amino acid is a conservative substitution. 
Generally, proteins are highly tolerant of conservative substitutions. Many programs for analysis 
of nucleotide or amino acid sequence similarity, such as FASTA and BLAST, specifically list percent 
identity of a matching region as an output parameter. Thus, for instance, Tables 1, 2, 4 and 5 
herein enumerate the percent identity and similarity of the highest scoring segment pair in each 
ORF and its listed relative. Further details concerning the algorithms and criteria used for 
homology searches are provided below and are described in the pertinent literature highlighted by 
the citations provided below. 
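
The distinction between percent identity and percent similarity drawn above can be reproduced in a short Python sketch. The grouping of "similar" amino acids below is a simplified assumption chosen for illustration and is not the scoring scheme used by FASTA or BLAST; the two ten-residue peptides are hypothetical.

```python
# Hypothetical groups of biochemically similar residues (illustration only).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("ST"), set("NQ")]


def are_similar(a, b):
    """True if two residues are identical or share a similarity group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for equal-length peptides."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(seq_a)
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    similar = sum(are_similar(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / n, 100.0 * similar / n


# The peptides differ at positions 1, 3 and 5; only the position 5
# difference (Y versus F) is a conservative substitution.
print(identity_and_similarity("MKTAYLAKQR", "GKPAFLAKQR"))
```

As in the worked example in the text, three differences in ten residues give 70% identity, and counting the one conservative substitution as a match raises the similarity to 80%.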
It will be appreciated that other criteria can be used to generate more inclusive and more 
exclusive listings of the types set out in the tables. As those of skill will appreciate, narrow and 
broad searches both are useful. Thus, a skilled artisan can readily identify ORFs in contigs of the 
Borrelia burgdorferi genome other than those listed in Tables 1-6, such as ORFs which are 
overlapping or encoded by the opposite strand of an identified ORF in addition to those 
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide 
molecules which modulates the expression of an operably linked ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 

sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are fragments which induce the expression of an operably linked 
ORF in response to a specific regulatory factor or physiological event. 

EMF sequences can be identified within the contigs of the Borrelia burgdorferi genome 

by their proximity to the ORFs provided in Tables 1-6. An intergenic segment, or a fragment of 
the intergenic segment, from about 10 to 200 nucleotides in length, taken from any one of the 
ORFs of Tables 1-6 will modulate the expression of an operably linked ORF in a fashion similar 
to that found with the naturally linked ORF sequence. As used herein, an "intergenic segment" 
refers to fragments of the Borrelia burgdorferi genome which are between two ORF(s) herein 

described. EMFs also can be identified using known EMFs as a target sequence or target motif 
in the computer-based systems of the present invention. Further, the two methods can be 
combined and used together. 
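
Identification of candidate EMF regions by proximity to ORFs can be sketched in Python as follows. The sketch assumes ORF coordinates are supplied as 1-based inclusive (start, end) pairs on a single contig strand, as in the third and fourth columns of the tables. Trimming a long gap to the bases nearest the downstream ORF is an assumption made for illustration; strand orientation is ignored, and the function name is hypothetical.

```python
def intergenic_segments(contig, orfs, min_len=10, max_len=200):
    """Return candidate intergenic segments lying between consecutive ORFs.

    contig -- the contig sequence as a string
    orfs   -- iterable of (start, end) pairs, 1-based and inclusive
    Each result is (start, end, sequence) with 1-based inclusive positions.
    Gaps shorter than min_len (including overlaps) are skipped; gaps longer
    than max_len are trimmed to the max_len bases nearest the next ORF.
    """
    segments = []
    ordered = sorted(orfs)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap_start = prev_end + 1
        gap_end = next_start - 1
        length = gap_end - gap_start + 1
        if length < min_len:
            continue
        if length > max_len:
            gap_start = gap_end - max_len + 1
        segments.append((gap_start, gap_end, contig[gap_start - 1:gap_end]))
    return segments


contig = "A" * 30 + "C" * 15 + "G" * 30
print(intergenic_segments(contig, [(1, 30), (46, 75)]))
```

Any segment returned in this way is only a candidate; its activity would still need to be confirmed, for example with the EMF trap vector described below.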

The presence and activity of an EMF can be confirmed using an EMF trap vector. An 
EMF trap vector contains a cloning site linked to a marker sequence. A marker sequence encodes 

an identifiable phenotype, such as antibiotic resistance or a complementing nutrition auxotrophic 
factor, which can be identified or assayed when the EMF trap vector is placed within an 
appropriate host under appropriate conditions. As described above, an EMF will modulate the 
expression of an operably linked marker sequence. A more detailed discussion of various marker 
sequences is provided below. A sequence which is suspected as being an EMF is cloned in all 

three reading frames in one or more restriction sites upstream from the marker sequence in the 
EMF trap vector. The vector is then transformed into an appropriate host using known 
procedures and the phenotype of the transformed host is examined under appropriate conditions. 
As described above, an EMF will modulate the expression of an operably linked marker 
sequence. 

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules 

which selectively hybridize to Borrelia burgdorferi sequences. DFs can be readily identified by 
identifying unique sequences within contigs of the Borrelia burgdorferi genome, such as by 
using well-known computer analysis software, and by generating and testing probes or 



WO 98/58943 PCT/US98/12764 

amplification primers consisting of the DF sequence in an appropriate diagnostic format which 

determines amplification or hybridization selectivity. 

The sequences falling within the scope of the present invention are not limited to the 

specific sequences herein described, but also include allelic and species variations thereof. Allelic 
and species variations can be routinely determined by comparing the sequences provided in SEQ 

ID NOS: 1-155, ORF IDs and ORFs within, a representative fragment thereof, or a nucleotide 

sequence at least 99% and preferably 99.9% identical to SEQ ID NOS: 1-155, ORF IDs and 

ORFs within, with a sequence from another isolate of the same species. Furthermore, to 

accommodate codon variability, the invention includes nucleic acid molecules coding for the same 
amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding 

region of an ORF, substitution of one codon for another which encodes the same amino acid is 

expressly contemplated. 
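
Whether two coding sequences differ only by synonymous codon substitutions can be checked with a sketch such as the following. It assumes the standard genetic code, written here in the conventional TCAG codon order; the function names and the short test sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding sequence codon by codon, ignoring trailing bases."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_polypeptide(dna_a, dna_b):
    """True if the sequences differ in nucleotides but not in translation."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


print(translate("ATGAAATTA"))                              # Met-Lys-Leu
print(encodes_same_polypeptide("ATGAAATTA", "ATGAAGCTG"))  # synonymous codons
print(encodes_same_polypeptide("ATGAAATTA", "ATGAAATTT"))  # Leu changed to Phe
```

The second call replaces AAA with AAG and TTA with CTG, both of which leave the encoded amino acids unchanged.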

Any specific sequence disclosed herein can be readily screened for errors by resequencing 

a particular fragment, such as an ORF, in both directions (i.e., sequence both strands). 
Alternatively, error screening can be performed by sequencing corresponding polynucleotides of 

Borrelia burgdorferi origin isolated by using part or all of the fragments in question as a probe or 

primer. 

Each of the ORF IDs and ORFs of the Borrelia burgdorferi genome disclosed in Tables 1- 
6, and the EMFs found 5' to the ORF IDs, can be used as polynucleotide reagents in numerous 

ways. For example, the sequences can be used as diagnostic probes or diagnostic amplification 
primers to detect the presence of a specific microbe in a sample, particularly Borrelia burgdorferi. 
Especially preferred in this regard are ORF IDs and ORFs such as those of Tables 3 and 6, which 
do not match previously characterized sequences from other organisms and thus are most likely 
to be highly selective for Borrelia burgdorferi. Also particularly preferred are ORF IDs and 
ORFs that can be used to distinguish between strains of Borrelia burgdorferi, particularly those 
that distinguish medically important strains, such as drug-resistant strains. 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. Triple helix 
formation optimally results in a shut-off of RNA transcription from DNA, while antisense RNA 
hybridization blocks translation of an mRNA molecule into polypeptide. Information from the 
sequences of the present invention can be used to design antisense and triple helix-forming 
oligonucleotides. Polynucleotides suitable for use in these methods are usually 20 to 40 bases in 
length and are designed to be complementary to a region of the gene involved in transcription, for 

triple-helix formation, or to the mRNA itself, for antisense inhibition. Both techniques have been 
demonstrated to be effective in model systems, and the requisite techniques are well known and 
involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., 
Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., 
Science 251:1360 (1991). Antisense techniques in general are discussed in, for instance, Okano, 




J. Neurochem. 56:560 (1991) and Oligodeoxynucleotides as Antisense Inhibitors of Gene 
Expression, CRC Press, Boca Raton, FL (1988). 
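
The design of an antisense oligonucleotide complementary to a region of an mRNA can be sketched as follows. The sketch simply returns the reverse complement, written as DNA, of a chosen window and enforces the 20 to 40 base length mentioned above; it does not evaluate secondary structure, specificity, or triple helix formation, and the function name and example sequence are hypothetical.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide complementary to part of an mRNA.

    mrna   -- mRNA sequence, written with either U or T
    start  -- 1-based position of the first targeted base
    length -- oligonucleotide length, restricted here to 20-40 bases
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    if start < 1:
        raise ValueError("start is a 1-based position")
    target = mrna.upper().replace("U", "T")[start - 1:start - 1 + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(DNA_COMPLEMENT)[::-1]


print(antisense_oligo("AUGGCUAGCUAGGCUUAACGGAUCC", start=1, length=20))
```

The returned sequence is written 5' to 3' and would hybridize to the targeted stretch of the mRNA.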

The present invention further provides recombinant constructs comprising one or more 
fragments of the Borrelia burgdorferi genomic fragments and contigs of the present invention. 
Certain preferred recombinant constructs of the present invention comprise a vector, such as a 
plasmid or viral vector, into which a fragment of the Borrelia burgdorferi genome has been 
inserted, in a forward or reverse orientation. In the case of a vector comprising one of the ORF 
IDs or ORFs of the present invention, the vector may further comprise regulatory sequences, 
including for example, a promoter, operably linked to the ORF ID or ORF. For vectors 

comprising the EMFs of the present invention, the vector may further comprise a marker 
sequence or heterologous ORF ID or ORF operably linked to the EMF. 

Large numbers of suitable vectors and promoters are known to those of skill in the art and 
are commercially available for generating the recombinant constructs of the present invention. 
The following vectors are provided by way of example. Useful bacterial vectors include 

phagescript, PsiX174, pBluescript SK, pBS KS, pNH8a, pNH16a, pNH18a, pNH46a 

(available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available from 
Pharmacia); pQE vectors (available from Promega). Useful eukaryotic vectors include pWLneo, 
pSV2cat, pOG44, pXT1, pSG (available from Stratagene); pSVK3, pBPV, pMSG, pSVL 
(available from Pharmacia). 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 

transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, 
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

The present invention further provides host cells containing any one of the isolated 
fragments of the Borrelia burgdorferi genomic fragments and contigs of the present invention, 
wherein the fragment has been introduced into the host cell using known methods. The host cell 
can be a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such 

as a yeast cell, or a procaryotic cell, such as a bacterial cell. 

A polynucleotide of the present invention, such as a recombinant construct comprising an 
ORF of the present invention, may be introduced into the host by a variety of well established 
techniques that are standard in the art, such as calcium phosphate transfection, DEAE-dextran 
mediated transfection and electroporation, which are described in, for instance, Davis, L. et al., 
BASIC METHODS IN MOLECULAR BIOLOGY (1986). 

A host cell containing one of the fragments of the Borrelia burgdorferi genomic fragments 
and contigs of the present invention, can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 
The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the 
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the Genetic Code, encode an identical polypeptide sequence. 

Preferred nucleic acid fragments of the present invention are the ORF IDs depicted in 
Tables 2, 3, 5 and 6, and ORFs within, which encode proteins. 

A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid 

sequence can be synthesized using commercially available peptide synthesizers. This is 

particularly useful in producing small peptides and fragments of larger polypeptides. Such short 
fragments as may be obtained most readily by synthesis are useful, for example, in generating 
antibodies against the native polypeptide, as discussed further below. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily employ well-known
methods for isolating polypeptides and proteins to isolate and purify polypeptides or
proteins of the present invention produced naturally by a bacterial strain, or by other methods.
Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells 
which have been altered to express the desired polypeptide or protein. As used herein, a cell is 
said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The polypeptides of the present invention are preferably provided in an isolated form, and
preferably are substantially purified. A recombinantly produced version of the B. burgdorferi
polypeptide can be substantially purified by the one-step method described by Smith et al. (1988)
Gene 67:31-40. Polypeptides of the invention also can be purified from natural or recombinant 
sources using antibodies directed against the polypeptides of the invention in methods which are 
well known in the art of protein purification. 

The invention further provides for isolated B. burgdorferi polypeptides comprising an
amino acid sequence selected from the group including: (a) the amino acid sequence of a
full-length B. burgdorferi polypeptide having the complete amino acid sequence from the first
methionine codon to the termination codon of each sequence listed in SEQ ID NOS: 1-155,
wherein said termination codon is at the end of each SEQ ID NO: and said first methionine is the
first methionine in frame with said termination codon; and (b) the amino acid sequence of a
full-length B. burgdorferi polypeptide having the complete amino acid sequence in (a) excepting the
N-terminal methionine.
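
As a non-limiting sketch, the rule in (a) for locating the first methionine codon in frame with the terminal termination codon can be expressed in Python as shown below. The function names and the example sequence are hypothetical, and the sketch assumes that the input ends with its termination codon and contains no internal in-frame termination codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_start(seq):
    """Return the 0-based index of the first ATG in frame with the terminal stop codon.

    The reading frame is fixed by the termination codon at the 3' end.
    Returns None when no in-frame ATG is present.
    """
    seq = seq.upper()
    if len(seq) < 6 or seq[-3:] not in STOP_CODONS:
        raise ValueError("sequence must end with a termination codon")
    frame = len(seq) % 3  # offset of the first complete in-frame codon
    for i in range(frame, len(seq) - 3, 3):
        if seq[i:i + 3] == "ATG":
            return i
    return None


def coding_region(seq):
    """Return the region from the first in-frame ATG through the termination codon."""
    start = first_in_frame_start(seq)
    return None if start is None else seq.upper()[start:]


# Hypothetical example: the ATG at index 1 is out of frame and is skipped.
print(coding_region("AATGCCATGAAATAA"))
```

The polypeptide of (b) would correspond to the translation of this region with the N-terminal methionine removed.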

The polypeptides of the present invention also include polypeptides having an amino acid
sequence at least 80% identical, more preferably at least 90% identical, and still more preferably
95%, 96%, 97%, 98% or 99% identical to those described in (a) and (b) above.

The present invention is further directed to polynucleotides encoding portions or 
fragments of the amino acid sequences described herein as well as to portions or fragments of the 
isolated amino acid sequences described herein. Fragments include portions of the amino acid
sequences described herein at least 5 contiguous amino acids in length and specified by any two
integers, one representing an N-terminal position and the other representing a C-terminal
position. The initiation codon of the ORFs of the present invention is position 1. The initiation
codon (position 1) for purposes of the present invention is the first methionine codon of each
ORF ID which is in frame with the termination codon at the end of each said sequence. Every
combination of an N-terminal and a C-terminal position that a fragment at least 5 contiguous amino
acid residues in length could occupy on any given ORF is included in the invention, i.e., from
initiation codon up to the termination codon. "At least" means a fragment may be 5 contiguous
amino acid residues in length or any integer between 5 and the number of residues in an ORF,
minus 1. Therefore, included in the invention are contiguous fragments specified by any
N-terminal and C-terminal positions of an amino acid sequence set forth in SEQ ID NOS: 1-155 or
Tables 1-6 wherein the length of the contiguous fragment is any integer between 5 and the number
of residues in an ORF minus 1.
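
For illustration, the set of fragments defined by N-terminal and C-terminal positions can be enumerated as in the following Python sketch. The function names are hypothetical; positions are 1-based with the initiating methionine at position 1, and fragment lengths run from 5 to the number of residues in the ORF minus 1, as stated above.

```python
def iter_fragments(orf_length, min_length=5):
    """Yield (n_terminal, c_terminal) positions, 1-based and inclusive,
    for every contiguous fragment of min_length to orf_length - 1 residues."""
    for n_pos in range(1, orf_length + 1):
        for c_pos in range(n_pos + min_length - 1, orf_length + 1):
            if c_pos - n_pos + 1 < orf_length:  # exclude the full-length sequence
                yield (n_pos, c_pos)


def count_fragments(orf_length, min_length=5):
    """Closed-form count: a fragment of length k can start at orf_length - k + 1 positions."""
    return sum(orf_length - k + 1 for k in range(min_length, orf_length))


# Hypothetical 7-residue ORF: three fragments of length 5 and two of length 6.
print(list(iter_fragments(7)))
print(count_fragments(7))
```

The count grows roughly with the square of the ORF length, which is why the fragments are specified by rule rather than listed individually.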

Further, the invention includes polypeptides comprising fragments specified by size, in 
amino acid residues, rather than by N-terminal and C-terminal positions. The invention includes
any fragment size, in contiguous amino acid residues, selected from integers between 5 and the
number of residues in an ORF, minus 1. Preferred sizes of contiguous polypeptide fragments
include about 5 amino acid residues, about 10 amino acid residues, about 20 amino acid residues,
about 30 amino acid residues, about 40 amino acid residues, about 50 amino acid residues, about
100 amino acid residues, about 200 amino acid residues, about 300 amino acid residues, and
about 400 amino acid residues. The preferred sizes are, of course, meant to exemplify, not limit,
the present invention as all size fragments representing any integer between 5 and the number of
residues in a full length sequence minus 1 are included in the invention. The present invention
also provides for the exclusion of any fragments specified by N-terminal and C-terminal
positions or by size in amino acid residues as described above. Any number of fragments
specified by N-terminal and C-terminal positions or by size in amino acid residues as described
above may be excluded.

The above fragments need not be active since they would be useful, for example, in 
immunoassays, in epitope mapping, epitope tagging, to generate antibodies to a particular portion 
of the protein, as vaccines, and as molecular weight markers. 

Further polypeptides of the present invention include polypeptides which have at least
90% similarity, more preferably at least 95% similarity, and still more preferably at least 96%, 
97%, 98% or 99% similarity to those described above. 

A further embodiment of the invention relates to a polypeptide which comprises the amino
acid sequence of a B. burgdorferi polypeptide having an amino acid sequence which contains at
least one conservative amino acid substitution, but not more than 50 conservative amino acid
substitutions, not more than 40 conservative amino acid substitutions, not more than 30
conservative amino acid substitutions, and not more than 20 conservative amino acid
substitutions. Also provided are polypeptides which comprise the amino acid sequence of a B.
burgdorferi polypeptide, having at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1
conservative amino acid substitutions.

By a polypeptide having an amino acid sequence at least, for example, 95% "identical" to 
a query amino acid sequence of the present invention, it is intended that the amino acid sequence 
of the subject polypeptide is identical to the query sequence except that the subject polypeptide
sequence may include up to five amino acid alterations per each 100 amino acids of the query
amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at
least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the
subject sequence may be inserted or deleted (indels) or substituted with another amino acid. These
alterations of the reference sequence may occur at the amino or carboxy terminal positions of the
reference amino acid sequence or anywhere between those terminal positions, interspersed either
individually among residues in the reference sequence or in one or more contiguous groups
within the reference sequence.
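
The arithmetic implied by this definition can be sketched as follows. The function name is hypothetical, and rounding down to a whole number of residues is an assumption made for the sketch; the text above states only the rate of up to five alterations per 100 residues for the 95% case.

```python
def max_alterations(query_length, min_percent_identity):
    """Largest whole number of altered residues (insertions, deletions or
    substitutions) compatible with the stated minimum percent identity.

    Both arguments are integers; rounding down is an assumption of this sketch.
    """
    if not 0 <= min_percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    return query_length * (100 - min_percent_identity) // 100


# A 200-residue query at 95% identity tolerates up to 10 alterations.
print(max_alterations(200, 95))
```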

As a practical matter, whether any particular polypeptide is at least 90%, 95%, 96%,
97%, 98% or 99% identical to the ORF amino acid sequences encoded by the sequences of SEQ
ID NOS: 1-155, as described herein, can be determined conventionally using known computer
programs. A preferred method for determining the best overall match between a query sequence
(a sequence of the present invention) and a subject sequence, also referred to as a global sequence
alignment, uses the FASTDB computer program based on the algorithm of
Brutlag et al., (1990) Comp. App. Biosci. 6:237-245. In a sequence alignment the query and
subject sequences are both amino acid sequences. The result of said global sequence alignment is
in percent identity. Preferred parameters used in a FASTDB amino acid alignment are:
Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group
Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size
Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is
shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal 
deletions, not because of internal deletions, the results, in percent identity, must be manually 
corrected. This is because the FASTDB program does not account for N- and C-terminal 
truncations of the subject sequence when calculating global percent identity. For subject 
sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is
corrected by calculating the number of residues of the query sequence that are N- and C-terminal
of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a
percent of the total residues of the query sequence. Whether a residue is matched/aligned is
determined by results of the FASTDB sequence alignment. This percentage is then subtracted
from the percent identity, calculated by the above FASTDB program using the specified
parameters, to arrive at a final percent identity score. This final percent identity score is what is
used for the purposes of the present invention. Only residues to the N- and C-termini of the
subject sequence, which are not matched/aligned with the query sequence, are considered for the
purposes of manually adjusting the percent identity score. That is, only query amino acid
residues outside the farthest N- and C-terminal residues of the subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100 residue 
query sequence to determine percent identity. The deletion occurs at the N-terminus of the 
subject sequence and therefore, the FASTDB alignment does not match/align the first 10
residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of
residues at the N- and C-termini not matched/total number of residues in the query sequence) so
10% is subtracted from the percent identity score calculated by the FASTDB program. If the
remaining 90 residues were perfectly matched the final percent identity would be 90%. In
another example, a 90 residue subject sequence is compared with a 100 residue query sequence.
This time the deletions are internal so there are no residues at the N- or C-termini of the subject
sequence which are not matched/aligned with the query. In this case the percent identity
calculated by FASTDB is not manually corrected. Once again, only residue positions outside the
N- and C-terminal ends of the subject sequence, as displayed in the FASTDB alignment, which
are not matched/aligned with the query sequence are manually corrected for. No other manual
corrections are to be made for the purposes of the present invention.
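
The manual correction described above amounts to a single subtraction, sketched here in Python. The function and parameter names are hypothetical; the FASTDB score and the counts of unmatched terminal query residues are assumed to have been read from an alignment produced separately, since the sketch does not run FASTDB itself.

```python
def corrected_percent_identity(fastdb_percent, query_length,
                               unmatched_n_terminal, unmatched_c_terminal):
    """Subtract from the FASTDB percent identity the share of query residues
    lying outside the N- and C-terminal ends of the subject sequence.

    Internal gaps are not counted here; they are already reflected in the
    FASTDB score.
    """
    overhang = unmatched_n_terminal + unmatched_c_terminal
    penalty = 100.0 * overhang / query_length
    return fastdb_percent - penalty


# First example from the text: 100-residue query, 90-residue subject,
# 10 unmatched N-terminal query residues, remaining residues matched perfectly.
print(corrected_percent_identity(100.0, 100, 10, 0))

# Second example: deletions are internal, so no correction is applied.
print(corrected_percent_identity(90.0, 100, 0, 0))
```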

The above polypeptide sequences are included irrespective of whether they have their 
normal biological activity. This is because even where a particular polypeptide molecule does not 
have biological activity, one of skill in the art would still know how to use the polypeptide, for 
instance, as a vaccine or to generate antibodies. Other uses of the polypeptides of the present
invention that do not have B. burgdorferi activity include, inter alia, as epitope tags, in epitope
mapping, and as molecular weight markers on SDS-PAGE gels or on molecular sieve gel 
filtration columns using methods known to those of skill in the art. 

As described below, the polypeptides of the present invention can also be used to raise polyclonal 
and monoclonal antibodies, which are useful in assays for detecting B. burgdorferi protein
expression or as agonists and antagonists capable of enhancing or inhibiting B. burgdorferi
protein function. Further, such polypeptides can be used in the yeast two-hybrid system to
"capture" B. burgdorferi protein binding proteins which are also candidate agonists and
antagonists according to the present invention. See, e.g., Fields et al. (1989) Nature
340:245-246.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis. The most
preferred cells are those which do not normally express the particular polypeptide or protein or
which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from 
recombinant (e.g., microbial or mammalian) expression systems. "Microbial" refers to 
recombinant polypeptides or proteins made in bacterial or fungal (e.g., yeast) expression 
systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially free
of native endogenous substances and unaccompanied by associated native glycosylation.
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of 
glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation 
pattern different from that expressed in mammalian cells. 



"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally,
DNA segments encoding the polypeptides and proteins provided by this invention are assembled
from fragments of the Borrelia burgdorferi genome and short oligonucleotide linkers, or from a
series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a
recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral
operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or
vector, for expressing a polypeptide from a DNA (RNA) sequence. The expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic regulatory element or elements
necessary for gene expression in the host, including elements required to initiate and maintain
transcription at a level sufficient for suitable expression of the desired polypeptide, including, for
example, promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a
structural or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate signals to initiate translation at the beginning of the desired coding region and
terminate translation at its end. Structural units intended for use in yeast or eukaryotic expression
systems preferably include a leader sequence enabling extracellular secretion of translated protein
by a host cell. Alternatively, where recombinant protein is expressed without a leader or
transport sequence, it may include an N-terminal methionine residue. This residue may or may
not be subsequently cleaved from the expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a
recombinant transcriptional unit into chromosomal DNA or carry the recombinant transcriptional
unit extrachromosomally. The cells can be prokaryotic or eukaryotic. Recombinant expression
systems as defined herein will express heterologous polypeptides or proteins upon induction of
the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under
the control of appropriate promoters. Cell-free translation systems can also be employed to
produce such proteins using RNAs derived from the DNA constructs of the present invention.
Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are
described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of
which is hereby incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and 
selectable markers permitting transformation of the host cell, e.g., the ampicillin resistance gene 
of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly expressed gene to
direct transcription of a downstream structural sequence. Such promoters can be derived from
operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor,
acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a structural DNA 
sequence encoding a desired protein together with suitable translation initiation and termination 
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and, when desirable, provide amplification within the host. 

Suitable prokaryotic hosts for transformation include strains of E. coli, B. subtilis, 
Salmonella typhimurium and various species within the genera Pseudomonas and Streptomyces. 
Others may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (available from
Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec,
Madison, WI, USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and the structural sequence to be expressed. 

Following transformation of a suitable host strain and growth of the host strain to an 
appropriate cell density, the selected promoter, where it is inducible, is derepressed or induced by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period to provide for expression of the induced gene product. Thereafter cells are
typically harvested, generally by centrifugation, disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described in Gluzman, Cell 23:175 (1981), and other cell lines capable of expressing
a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. 

Mammalian expression vectors will comprise an origin of replication, a suitable promoter 
and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor
and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed
sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin,
early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required 
nontranscribed genetic elements. 

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by
initial extraction from cell pellets, followed by one or more salting-out, aqueous ion exchange or
size exclusion chromatography steps. Microbial cells employed in expression of proteins can be
disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents. Protein refolding steps can be used, as necessary, in
completing configuration of the mature protein. Finally, high performance liquid
chromatography (HPLC) can be employed for final purification steps.

The present invention further includes isolated polypeptides, proteins and nucleic acid 
molecules which are substantially equivalent to those herein described. As used herein, 
substantially equivalent can refer both to nucleic acid and amino acid sequences, for example a 
mutant sequence, that varies from a reference sequence by one or more substitutions, deletions,
or additions, the net effect of which does not result in an adverse functional dissimilarity between
reference and subject sequences. Particularly preferred in this regard are conservative
substitutions, known to those of skill in the art. For purposes of the present invention,
sequences having equivalent biological activity and equivalent expression characteristics are
considered substantially equivalent. For purposes of determining equivalence, truncation of the
mature sequence (e.g., removal of leader sequence(s)) should be disregarded.

The invention further provides methods of obtaining homologs from other strains of 
Borrelia burgdorferi, of the fragments of the Borrelia burgdorferi genome of the present 
invention and homologs of the proteins encoded by the ORFs of the present invention. As used 
herein, a sequence or protein of Borrelia burgdorferi is defined as a homolog of a fragment of the
Borrelia burgdorferi fragments or contigs or a protein encoded by one of the ORFs of the present
invention, if it shares significant homology to one of the fragments of the Borrelia burgdorferi
genome of the present invention or a protein encoded by one of the ORFs of the present
invention. Specifically, by using the sequence disclosed herein as a probe or as primers, and
techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain
homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant 
homology" if the two contain regions which possess greater than 85% sequence (amino acid or 
nucleic acid) homology. Preferred homologs in this regard are those with more than 90% 
homology. Especially preferred are those with 95% or more homology. Among especially
preferred homologs, those with 96%, 97%, 98%, 99% or more homology are particularly
preferred. The most preferred homologs among these are those with 99.9% homology or more.
It will be understood that, among measures of homology, identity is particularly preferred in this 
regard. 
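
The preference tiers set out above can be summarized in a short Python sketch. The function name and the returned labels are hypothetical shorthand, and the percent value is assumed to have been computed beforehand over the region being compared.

```python
def homology_tier(percent_homology):
    """Map a percent homology value to the preference tier stated in the text.

    Boundaries follow the wording above: "greater than 85%", "more than 90%",
    "95% or more", "96% ... or more" and "99.9% ... or more".
    """
    if percent_homology >= 99.9:
        return "most preferred"
    if percent_homology >= 96:
        return "particularly preferred"
    if percent_homology >= 95:
        return "especially preferred"
    if percent_homology > 90:
        return "preferred"
    if percent_homology > 85:
        return "significant homology"
    return "below the significant homology threshold"


for value in (85, 88, 92, 95, 97.5, 99.9):
    print(value, homology_tier(value))
```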

Region specific primers or probes derived from the nucleotide sequence provided in SEQ 
ID NOS: 1-155 or from a nucleotide sequence at least 95%, particularly at least 96%, 97%, 98% 
or 99%, especially at least 99.5% identical to a sequence of SEQ ID NOS: 1-155 can be used to 
prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned 
DNA encoding a homolog. Methods suitable to this aspect of the present invention are well
known and have been described in great detail in many publications such as, for example, Innis
et al., PCR Protocols, Academic Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS: 1-155 or from a nucleotide sequence 
having an aforementioned identity to a sequence of SEQ ID NOS: 1-155, one skilled in the art will 
recognize that by employing high stringency conditions (e.g., annealing at 50-60°C in 6X SSPC 
and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only sequences which are greater
than 75% homologous to the primer will be amplified. By employing lower stringency 
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 
42°C in 0.5X SSPC), sequences which are greater than 40-50% homologous to the primer will 
also be amplified. 

When using DNA probes derived from SEQ ID NOS: 1-155, or from a nucleotide 
sequence having an aforementioned identity to a sequence of SEQ ID NOS: 1-155, for
colony/plaque hybridization, one skilled in the art will recognize that by employing high
stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90%
homologous to the probe can be obtained, and that by employing lower stringency conditions 
(e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X 
SSPC), sequences having regions which are greater than 35-45% homologous to the probe will 
be obtained. 

Any organism can be used as the source for homologs of the present invention so long as 
the organism naturally expresses such a protein or contains genes encoding the same. The most
preferred organisms for isolating homologs are bacteria which are closely related to Borrelia
burgdorferi.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF of the ORF IDs provided in Tables 1, 2, 4 and 5 is identified with a function 
by homology to a known gene or polypeptide. As a result, one skilled in the art can use the 
polypeptides of the present invention for commercial, therapeutic and industrial purposes 
consistent with the type of putative identification of the polypeptide. Such identifications permit 
one skilled in the art to use the Borrelia burgdorferi ORFs in a manner similar to the known type 
of sequences for which the identification is made; for example, to ferment a particular sugar
source or to produce a particular metabolite. A variety of reviews illustrative of this aspect of the
invention are available, including the following reviews on the industrial use of enzymes, for
example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd
Ed., MacMillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC
SYNTHESES, Tramper et al., Eds., Elsevier Science Publishers, Amsterdam, The Netherlands
(1985). A variety of exemplary uses that illustrate this and similar aspects of the present 
invention are discussed below. 

1. Biosynthetic Enzymes

Open reading frames encoding proteins involved in mediating the catalytic reactions
involved in intermediary and macromolecular metabolism, the biosynthesis of small molecules,
cellular processes and other functions, including enzymes involved in the degradation of the
intermediary products of metabolism, enzymes involved in central intermediary metabolism,
enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation,
enzymes involved in ATP proton motive force conversion, enzymes involved in broad regulatory
function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis,
and enzymes involved in cofactor and vitamin synthesis, can be used for industrial biosynthesis.

The various metabolic pathways present in Borrelia burgdorferi can be identified based on
absolute nutritional requirements as well as by examining the various enzymes identified in Tables
1-6 and SEQ ID NOS: 1-155.

Of particular interest are polypeptides involved in the degradation of intermediary 
metabolites as well as non-macromolecular metabolism. Such enzymes include amylases, 
glucose oxidases, and catalase. 

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic
enzymes find use in a number of industrial processes including the processing of flax and other
vegetable fibers, in the extraction, clarification and depectinization of fruit juices, in the extraction
of vegetable oil and in the maceration of fruits and vegetables to give unicellular fruits. A
detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts et
al., Symbiosis 21:19 (1986) and Voragen et al. in Biocatalysts In Agricultural Biotechnology,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Borrelia 
burgdorferi. Enzymes involved in the degradation of sugars, such as, particularly, glucose, 
galactose, fructose and xylose, can be used in industrial fermentation. Some of the important
sugar transforming enzymes, from a commercial viewpoint, include sugar isomerases such as
glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial
production of ascorbic acid using the Reichstein's procedure, as described in Krueger et al.,
Biotechnology 6(A), Rhine et al., Eds., Verlag Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as
well as in an immobilized form for the deoxygenation of beer. See, for instance, Hartmeir et al.,
Biotechnology Letters 7:21 (1979). The most important application of GOD is the industrial
scale fermentation of gluconic acid. There is a market for gluconic acids, which are used in the detergent,
textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as described, for
example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI;
Benett et al., Eds., Academic Press, New York (1985). In addition to industrial applications,
GOD has found applications in medicine for quantitative determination of glucose in body fluids
and, recently, in biotechnology for analyzing syrups from starch and cellulose hydrolysates. This
application is described in Owusu et al., Biochem. et Biophysica. Acta. 572:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and 
sugar cane. In the field of industrial enzymes, the glucose isomerase process shows the largest 
expansion in the market today. Initially, soluble enzymes were used and later immobilized
enzymes were developed (Krueger et al., Biotechnology, The Textbook of Industrial
Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the
use of glucose-produced high fructose syrups is by far the largest industrial business using
immobilized enzymes. A review of the industrial use of these enzymes is provided by 
Jorgensen, Starch 40:307 (1988). 

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus
represent one of the largest volumes of microbial enzymes used in the industrial sector. Because
of their industrial importance, there is a large body of published and unpublished information
regarding the use of these enzymes in industrial processes. (See Faultman et al., Acid Proteases
Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and Godfrey et
al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report
Industrial Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial
lipases, described by, for instance, Macrae et al., Philosophical Transactions of the Royal
Society of London 310:221 (1985) and Poserke, Journal of the American Oil Chemist Society
61:1758 (1984). A major use of lipases is in the fat and oil industry for the production of neutral
glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the
course of the washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the
synthesis of complex organic molecules is gaining popularity at a great rate. One area of great
interest is the preparation of chiral intermediates. Preparation of chiral intermediates is of interest
to a wide range of synthetic chemists, particularly those scientists involved with the preparation of
new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al., Recent
Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton,
Florida (1990)). The following reactions catalyzed by enzymes are of interest to organic
chemists: hydrolysis of carboxylic acid esters, phosphate esters, amides and nitriles,
esterification reactions, trans-esterification reactions, synthesis of amides, reduction of alkanones
and oxoalkanoates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to
sulfoxides, and carbon bond forming reactions such as the aldol reaction.



When considering the use of an enzyme encoded by one of the ORFs of the present
invention for biotransformation and organic synthesis, it is sometimes necessary to consider the
respective advantages and disadvantages of using a microorganism as opposed to an isolated
enzyme. The pros and cons of using a whole cell system on the one hand or an isolated, partially
purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in
Britain (1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino
acids, are useful in the catalytic production of amino acids. The advantage of using microbial
based enzyme systems is that the amino transferase enzymes catalyze the stereoselective
synthesis of only L-amino acids and generally possess uniformly high catalytic rates. A
description of the use of amino transferases for amino acid production is provided by
Roselle-David, Methods of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes
enzymes involved in nucleic acid synthesis, repair, and recombination.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can
be used in a variety of procedures and methods known in the art which are currently applied to
other proteins. The proteins of the present invention can further be used to generate an antibody
which selectively binds the protein.

B. burgdorferi protein-specific antibodies for use in the present invention can be raised
against the intact B. burgdorferi protein or an antigenic polypeptide fragment thereof, which may
be presented together with a carrier protein, such as an albumin, to an animal system (such as
rabbit or mouse) or, if it is long enough (at least about 25 amino acids), without a carrier.

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is meant to
include intact molecules, single chain whole antibodies, and antibody fragments. Antibody
fragments of the present invention include Fab and F(ab')2 and other fragments including
single-chain Fvs (scFv) and disulfide-linked Fvs (sdFv). Also included in the present invention are
chimeric and humanized monoclonal antibodies and polyclonal antibodies specific for the
polypeptides of the present invention. The antibodies of the present invention may be prepared
by any of a variety of methods. For example, cells expressing a polypeptide of the present
invention or an antigenic fragment thereof can be administered to an animal in order to induce the
production of sera containing polyclonal antibodies. For example, a preparation of B.
burgdorferi polypeptide or fragment thereof is prepared and purified to render it substantially free
of natural contaminants. Such a preparation is then introduced into an animal in order to produce
polyclonal antisera of greater specific activity.

In a preferred method, the antibodies of the present invention are monoclonal antibodies
or binding fragments thereof. Such monoclonal antibodies can be prepared using hybridoma
technology. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold
Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: MONOCLONAL
ANTIBODIES AND T-CELL HYBRIDOMAS 563-681 (Elsevier, N.Y., 1981). Fab and
F(ab')2 fragments may be produced by proteolytic cleavage, using enzymes such as papain (to
produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively, B. burgdorferi
polypeptide-binding fragments, chimeric, and humanized antibodies can be produced through the
application of recombinant DNA technology or through synthetic chemistry using methods
known in the art.

Alternatively, additional antibodies capable of binding to the polypeptide antigen of the
present invention may be produced in a two-step procedure through the use of anti-idiotypic
antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and
that, therefore, it is possible to obtain an antibody which binds to a second antibody. In
accordance with this method, B. burgdorferi polypeptide-specific antibodies are used to
immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to
produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce
an antibody whose ability to bind to the B. burgdorferi polypeptide-specific antibody can be
blocked by the B. burgdorferi polypeptide antigen. Such antibodies comprise anti-idiotypic
antibodies to the B. burgdorferi polypeptide-specific antibody and can be used to immunize an
animal to induce formation of further B. burgdorferi polypeptide-specific antibodies.



Antibodies and fragments thereof of the present invention may be described by the
portion of a polypeptide of the present invention recognized or specifically bound by the
antibody. Antibody binding fragments of a polypeptide of the present invention may be
described or specified in the same manner as for polypeptide fragments discussed above, i.e.,
by N-terminal and C-terminal positions or by size in contiguous amino acid residues. Any
number of antibody binding fragments of a polypeptide of the present invention, specified by
N-terminal and C-terminal positions or by size in amino acid residues, as described above, may also
be excluded from the present invention. Therefore, the present invention includes antibodies that
specifically bind a particularly described fragment of a polypeptide of the present invention and
allows for the exclusion of the same.

Antibodies and fragments thereof of the present invention may also be described or specified in
terms of their cross-reactivity. Antibodies and fragments that do not bind polypeptides of any
species of Borrelia other than B. burgdorferi are included in the present invention.
Likewise, antibodies and fragments that bind only species of Borrelia, i.e., antibodies and
fragments that do not bind bacteria from any genus other than Borrelia, are included in the
present invention.

3. Epitope-Bearing Portions

In another aspect, the invention provides peptides and polypeptides comprising
epitope-bearing portions of the B. burgdorferi polypeptides of the present invention. These
epitopes are immunogenic or antigenic epitopes of the polypeptides of the present invention. An
"immunogenic epitope" is defined as a part of a protein that elicits an antibody response when the
whole protein or polypeptide is the immunogen. These immunogenic epitopes are believed to be
confined to a few loci on the molecule. On the other hand, a region of a protein molecule to
which an antibody can bind is defined as an "antigenic determinant" or "antigenic epitope." The
number of immunogenic epitopes of a protein generally is less than the number of antigenic
epitopes. See, e.g., Geysen, et al. (1983) Proc. Natl. Acad. Sci. USA 81:3998-4002. Amino
acid residues comprising antigenic epitopes may be determined by algorithms such as the
Jameson-Wolf analysis or similar algorithms or by in vivo testing for an antigenic response using
the methods described herein or those known in the art.

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that
contain a region of a protein molecule to which an antibody can bind), it is well known in the art
that relatively short synthetic peptides that mimic part of a protein sequence are routinely capable
of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, et
al., (1983) Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of
simple chemical rules, and are confined neither to immunodominant regions of intact proteins
(i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are extremely
hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies
that bind to the mimicked protein; longer peptides, especially those containing proline residues,
usually are effective. See, Sutcliffe, et al., supra, p. 661. For instance, 18 of 20 peptides
designed according to these guidelines, containing 8-39 residues covering 75% of the sequence
of the influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with
the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from
the rabies glycoprotein induced antibodies that precipitated the respective proteins.

Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful
to raise antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the
invention. Thus, a high proportion of hybridomas obtained by fusion of spleen cells from
donors immunized with an antigen epitope-bearing peptide generally secrete antibody reactive
with the native protein. See Sutcliffe, et al., supra, p. 663. The antibodies raised by antigenic
epitope-bearing peptides or polypeptides are useful to detect the mimicked protein, and antibodies
to different peptides may be used for tracking the fate of various regions of a protein precursor
which undergoes post-translational processing. The peptides and anti-peptide antibodies may be
used in a variety of qualitative or quantitative assays for the mimicked protein, for instance in
competition assays since it has been shown that even short peptides (e.g., about 9 amino acids)
can bind and displace the larger peptides in immunoprecipitation assays. See, e.g., Wilson, et
al., (1984) Cell 37:767-778. The anti-peptide antibodies of the invention also are useful for
purification of the mimicked protein, for instance, by adsorption chromatography using methods
known in the art.

Antigenic epitope-bearing peptides and polypeptides of the invention designed according
to the above guidelines preferably contain a sequence of at least seven, more preferably at least
nine and most preferably between about 10 and about 50 amino acids (i.e., any integer between 7
and 50) contained within the amino acid sequence of a polypeptide of the invention. However,
peptides or polypeptides comprising a larger portion of an amino acid sequence of a polypeptide
of the invention, containing about 50 to about 100 amino acids, or any length up to and including
the entire amino acid sequence of a polypeptide of the invention, also are considered
epitope-bearing peptides or polypeptides of the invention and also are useful for inducing
antibodies that react with the mimicked protein. Preferably, the amino acid sequence of the
epitope-bearing peptide is selected to provide substantial solubility in aqueous solvents (i.e., the
sequence includes relatively hydrophilic residues and highly hydrophobic sequences are
preferably avoided); and sequences containing proline residues are particularly preferred.
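
By way of illustration only, the selection guidelines above (a length of 7 to 50 residues, avoidance of highly hydrophobic sequences, and a preference for proline-containing sequences) can be sketched in Python as follows. The use of the Kyte-Doolittle hydropathy scale as the measure of hydrophobicity, the cutoff value of 0.0, and all function names are assumptions made for this sketch; they are not taken from the cited references, and the input sequence in the test is an arbitrary placeholder rather than a sequence of the invention.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
# Using this scale is an assumption of the sketch, not part of the source text.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy of a peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def candidate_peptides(sequence, length=10, max_hydropathy=0.0):
    """Yield (start, peptide, has_proline) for each window that passes.

    start is 1-based. Windows whose mean hydropathy exceeds
    max_hydropathy (a hypothetical cutoff) are skipped as too hydrophobic.
    """
    if not 7 <= length <= 50:
        raise ValueError("length must be an integer between 7 and 50")
    for start in range(len(sequence) - length + 1):
        peptide = sequence[start:start + length]
        if mean_hydropathy(peptide) <= max_hydropathy:
            yield start + 1, peptide, "P" in peptide
```

Candidates flagged as containing proline could then be ranked first, consistent with the stated preference.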

The epitope-bearing peptides and polypeptides of the present invention may be produced
by any conventional means for making peptides or polypeptides including recombinant means
using nucleic acid molecules of the invention. For instance, an epitope-bearing amino acid
sequence of the present invention may be fused to a larger polypeptide which acts as a carrier
during recombinant production and purification, as well as during immunization to produce
anti-peptide antibodies. Epitope-bearing peptides also may be synthesized using known methods
of chemical synthesis. For instance, Houghten has described a simple method for synthesis of
large numbers of peptides, such as 10-20 mg of 248 different 13 residue peptides representing
single amino acid variants of a segment of the HA1 polypeptide which were prepared and
characterized (by ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc.
Natl. Acad. Sci. USA 82:5131-5135 (1985)). This "Simultaneous Multiple Peptide Synthesis
(SMPS)" process is further described in U.S. Patent No. 4,631,211 to Houghten and coworkers
(1986). In this procedure the individual resins for the solid-phase synthesis of various peptides
are contained in separate solvent-permeable packets, enabling the optimal use of the many
identical repetitive steps involved in solid-phase methods. A completely manual procedure
allows 500-1000 or more syntheses to be conducted simultaneously (Houghten et al. (1985)
Proc. Natl. Acad. Sci. 82:5131-5135 at 5134).
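
The figure of 248 peptides is consistent with one plausible reading of "single amino acid variants": replacing each of the 13 positions with each of the 19 other common amino acids gives 13 x 19 = 247 variants, or 248 peptides when the parent sequence is included. The following Python sketch enumerates such a set under that assumption. The parent sequence shown in the usage note is an arbitrary placeholder, not the HA1 segment used by Houghten, and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 common amino acids


def single_substitution_variants(parent):
    """Return every peptide differing from parent at exactly one position."""
    variants = []
    for pos, original in enumerate(parent):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.append(parent[:pos] + aa + parent[pos + 1:])
    return variants
```

For a 13-residue placeholder such as "ACDEFGHIKLMNP", the function returns 247 distinct variants.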

Epitope-bearing peptides and polypeptides of the invention are used to induce antibodies
according to methods well known in the art. See, e.g., Sutcliffe, et al., supra; Wilson, et al.,
supra; and Bittle, et al. (1985) J. Gen. Virol. 66:2347-2354. Generally, animals may be
immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of
the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus
toxoid. For instance, peptides containing cysteine may be coupled to carrier using a linker such
as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be
coupled to carrier using a more general linking agent such as glutaraldehyde. Animals such as
rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by
intraperitoneal and/or intradermal injection of emulsions containing about 100 µg peptide or
carrier protein and Freund's adjuvant. Several booster injections may be needed, for instance, at
intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be
detected, for example, by ELISA assay using free peptide adsorbed to a solid surface. The titer
of anti-peptide antibodies in serum from an immunized animal may be increased by selection of
anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution
of the selected antibodies according to methods well known in the art.

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that
elicit an antibody response when the whole protein is the immunogen, are identified according to
methods known in the art. For instance, Geysen, et al., supra, discloses a procedure for rapid
concurrent synthesis on solid supports of hundreds of peptides of sufficient purity to react in an
ELISA. Interaction of synthesized peptides with antibodies is then easily detected without
removing them from the support. In this manner a peptide bearing an immunogenic epitope of a
desired protein may be identified routinely by one of ordinary skill in the art. For instance, the
immunologically important epitope in the coat protein of foot-and-mouth disease virus was
located by Geysen et al., supra, with a resolution of seven amino acids by synthesis of an
overlapping set of all 208 possible hexapeptides covering the entire 213 amino acid sequence of
the protein. Then, a complete replacement set of peptides in which all 20 amino acids were
substituted in turn at every position within the epitope was synthesized, and the particular amino
acids conferring specificity for the reaction with antibody were determined. Thus, peptide
analogs of the epitope-bearing peptides of the invention can be made routinely by this method.
U.S. Patent No. 4,708,781 to Geysen (1987) further describes this method of identifying a
peptide bearing an immunogenic epitope of a desired protein.
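
The count of 208 hexapeptides follows from the number of overlapping windows of length k in a sequence of length N, which is N - k + 1, here 213 - 6 + 1 = 208. A minimal Python sketch of generating such an overlapping set is given below. The function name is hypothetical, and the sequence used in the usage note is a synthetic placeholder of the right length, not the coat protein sequence studied by Geysen.

```python
def overlapping_peptides(sequence, window=6):
    """Return every contiguous peptide of the given window length,
    stepping one residue at a time from the N-terminus."""
    if window < 1:
        raise ValueError("window must be a positive integer")
    return [sequence[i:i + window]
            for i in range(len(sequence) - window + 1)]
```

Applied to any 213-residue sequence with the default window of six, the function returns 208 peptides, the first beginning at residue 1 and the last at residue 208.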

Further still, U.S. Patent No. 5,194,392, to Geysen (1990), describes a general method
of detecting or determining the sequence of monomers (amino acids or other compounds) which
is a topological equivalent of the epitope (i.e., a "mimotope") which is complementary to a
particular paratope (antigen binding site) of an antibody of interest. More generally, U.S. Patent
No. 4,433,092, also to Geysen (1989), describes a method of detecting or determining a
sequence of monomers which is a topographical equivalent of a ligand which is complementary
to the ligand binding site of a particular receptor of interest. Similarly, U.S. Patent No.
5,480,971 to Houghten, R. A. et al. (1996) discloses linear C1-C7-alkyl peralkylated
oligopeptides and sets and libraries of such peptides, as well as methods for using such
oligopeptide sets and libraries for determining the sequence of a peralkylated oligopeptide that
preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods. The
entire disclosure of each document cited in this section on "Polypeptides and Fragments" is
hereby incorporated herein by reference.

As one of skill in the art will appreciate, the polypeptides of the present invention and the
epitope-bearing fragments thereof described above can be combined with parts of the constant
domain of immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins
facilitate purification and show an increased half-life in vivo. This has been shown, e.g., for
chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various
domains of the constant regions of the heavy or light chains of mammalian immunoglobulins
(EPA 0,394,827; Traunecker et al. (1988) Nature 331:84-86). Fusion proteins that have a
disulfide-linked dimeric structure due to the IgG part can also be more efficient in binding and
neutralizing other molecules than a monomeric B. burgdorferi polypeptide or fragment thereof
alone. See Fountoulakis et al. (1995) J. Biochem. 270:3958-3964. Nucleic acids encoding the
above epitopes of B. burgdorferi polypeptides can also be recombined with a gene of interest as
an epitope tag to aid in detection and purification of the expressed polypeptide.

4. Diagnostic Assays and Kits

The present invention further relates to methods for assaying Borrelia infection in an
animal by detecting the expression of genes encoding Borrelia polypeptides of the present
invention. The methods comprise analyzing tissue or body fluid from the animal for
Borrelia-specific antibodies, nucleic acids, or proteins. Nucleic acid specific to
Borrelia is assayed by PCR or hybridization techniques using nucleic acid sequences of the
present invention as either hybridization probes or primers. See, e.g., Sambrook et al.
Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, 2nd ed., 1989,
page 54 reference); Eremeeva et al. (1994) J. Clin. Microbiol. 32:803-810 (describing
differentiation among spotted fever group Rickettsiae species by analysis of restriction fragment
length polymorphism of PCR-amplified DNA) and Chen et al. (1994) J. Clin. Microbiol.
32:589-595 (detecting B. burgdorferi nucleic acids via PCR).

Where diagnosis of a disease state related to infection with Borrelia has already been
made, the present invention is useful for monitoring progression or regression of the disease state,
whereby patients exhibiting enhanced Borrelia gene expression will experience a worse clinical
outcome relative to patients expressing these gene(s) at a lower level.

By "biological sample" is intended any biological sample obtained from an animal, cell
line, tissue culture, or other source which contains Borrelia polypeptide, mRNA, or DNA.
Biological samples include body fluids (such as saliva, blood, plasma, urine, mucus, synovial
fluid, etc.), tissues (such as muscle, skin, and cartilage) and any other biological source suspected
of containing Borrelia polypeptides or nucleic acids. Methods for obtaining biological samples
such as tissue are well known in the art.

The present invention is useful for detecting diseases related to Borrelia infections in 
animals. Preferred animals include monkeys, apes, cats, dogs, birds, cows, pigs, mice, horses, 
rabbits and humans. Particularly preferred are humans. 



WO 98/58943 PCT/US98/12764 

Total RNA can be isolated from a biological sample using any suitable technique such as
the single-step guanidinium-thiocyanate-phenol-chloroform method described in Chomczynski et
al. (1987) Anal. Biochem. 162:156-159. mRNA encoding Borrelia polypeptides having
sufficient homology to the nucleic acid sequences identified in SEQ ID NOS: 1-155 to allow for
hybridization between complementary sequences are then assayed using any appropriate method.
These include Northern blot analysis, S1 nuclease mapping, the polymerase chain reaction
(PCR), reverse transcription in combination with the polymerase chain reaction (RT-PCR), and
reverse transcription in combination with the ligase chain reaction (RT-LCR).

Northern blot analysis can be performed as described in Harada et al. (1990) Cell
63:303-312. Briefly, total RNA is prepared from a biological sample as described above. For
the Northern blot, the RNA is denatured in an appropriate buffer (such as glyoxal/dimethyl
sulfoxide/sodium phosphate buffer), subjected to agarose gel electrophoresis, and transferred
onto a nitrocellulose filter. After the RNAs have been linked to the filter by a UV linker, the filter
is prehybridized in a solution containing formamide, SSC, Denhardt's solution, denatured
salmon sperm DNA, SDS, and sodium phosphate buffer. A B. burgdorferi polynucleotide sequence
shown in SEQ ID NOS: 1-155 labeled according to any appropriate method (such as the
32P-multiprimed DNA labeling system (Amersham)) is used as probe. After hybridization
overnight, the filter is washed and exposed to x-ray film. DNA for use as probe according to the
present invention is described in the sections above and will preferably be at least 15 nucleotides in
length.

S1 mapping can be performed as described in Fujita et al. (1987) Cell 49:357-367. To
prepare probe DNA for use in S1 mapping, the sense strand of an above-described B.
burgdorferi DNA sequence of the present invention is used as a template to synthesize labeled
antisense DNA. The antisense DNA can then be digested using an appropriate restriction
endonuclease to generate further DNA probes of a desired length. Such antisense probes are
useful for visualizing protected bands corresponding to the target mRNA (i.e., mRNA encoding
Borrelia polypeptides).
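
As an illustration of the relationship between a sense-strand template and the antisense probe synthesized from it, the following Python sketch computes the reverse complement of a DNA sequence. It also enforces the preferred minimum probe length of 15 nucleotides noted above for Northern blot probes; applying that preference to S1 mapping probes is an assumption of this sketch. The function name is hypothetical, and any sequence passed to it here is a placeholder rather than one of SEQ ID NOS: 1-155.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_probe(sense, min_length=15):
    """Return the antisense (reverse-complement) strand of a sense DNA sequence.

    Raises ValueError for sequences shorter than min_length or containing
    characters other than A, C, G and T.
    """
    if len(sense) < min_length:
        raise ValueError("probe should be at least %d nucleotides" % min_length)
    if set(sense.upper()) - set("ACGT"):
        raise ValueError("sequence contains non-ACGT characters")
    return sense.translate(_COMPLEMENT)[::-1]
```

Because the antisense strand is read 5' to 3' in the opposite direction, applying the function twice returns the original sense sequence.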

Levels of mRNA encoding Borrelia polypeptides are assayed, e.g., using the
RT-PCR method described in Makino et al. (1990) Technique 2:295-301. By this method, the
radioactivities of the "amplicons" in the polyacrylamide gel bands are linearly related to the initial
concentration of the target mRNA. Briefly, this method involves adding total RNA isolated from
a biological sample in a reaction mixture containing a RT primer and appropriate buffer. After
incubating for primer annealing, the mixture can be supplemented with a RT buffer, dNTPs,
DTT, RNase inhibitor and reverse transcriptase. After incubation to achieve reverse transcription
of the RNA, the RT products are then subject to PCR using labeled primers. Alternatively, rather
than labeling the primers, a labeled dNTP can be included in the PCR reaction mixture. PCR
amplification can be performed in a DNA thermal cycler according to conventional techniques.
After a suitable number of rounds to achieve amplification, the PCR reaction mixture is
electrophoresed on a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate
bands (corresponding to the mRNA encoding the Borrelia polypeptides of the present invention)
is quantified using an imaging analyzer. RT and PCR reaction ingredients and conditions,
reagent and gel concentrations, and labeling methods are well known in the art. Variations on the
RT-PCR method will be apparent to the skilled artisan. Other PCR methods that can detect the
nucleic acid of the present invention can be found in PCR PRIMER: A LABORATORY
MANUAL (C.W. Dieffenbach et al. eds., Cold Spring Harbor Lab Press, 1995).

The polynucleotides of the present invention, including both DNA and RNA, may be
used to detect polynucleotides of the present invention or Borrelia species including B.
burgdorferi using bio chip technology. The present invention includes both high density chip
arrays (>1000 oligonucleotides per cm²) and low density chip arrays (<1000 oligonucleotides per
cm²). Bio chips comprising arrays of polynucleotides of the present invention may be used to
detect Borrelia species, including B. burgdorferi, in biological and environmental samples and to
diagnose an animal, including humans, with a B. burgdorferi or other Borrelia infection. The
bio chips of the present invention may comprise polynucleotide sequences of other pathogens
including bacterial, viral, parasitic, and fungal polynucleotide sequences, in addition to the
polynucleotide sequences of the present invention, for use in rapid differential pathogenic
detection and diagnosis. The bio chips can also be used to monitor a B. burgdorferi or other
Borrelia infection and to monitor the genetic changes (deletions, insertions, mismatches, etc.) in
response to drug therapy in the clinic and drug development in the laboratory. The bio chip
technology comprising arrays of polynucleotides of the present invention may also be used to
simultaneously monitor the expression of a multiplicity of genes, including those of the present
invention. The polynucleotides used to comprise a selected array may be specified in the same
manner as for the fragments, i.e., by their 5' and 3' positions or length in contiguous base pairs
and include from. Methods and particular uses of the polynucleotides of the present invention to
detect Borrelia species, including B. burgdorferi, using bio chip technology include those known
in the art and those of: U.S. Patent Nos. 5510270, 5545531, 5445934, 5677195, 5532128,
5556752, 5527681, 5451683, 5424186, 5607646, 5658732 and World Patent Nos.
WO/9710365, WO/9511995, WO/9743447, WO/9535505, each incorporated herein in their
entireties.

Biosensors using the polynucleotides of the present invention may also be used to detect,
diagnose, and monitor B. burgdorferi or other Borrelia species and infections thereof.
Biosensors using the polynucleotides of the present invention may also be used to detect
particular polynucleotides of the present invention. Biosensors using the polynucleotides of the
present invention may also be used to monitor the genetic changes (deletions, insertions,
mismatches, etc.) in response to drug therapy in the clinic and drug development in the
laboratory. Methods and particular uses of the polynucleotides of the present invention to detect
Borrelia species, including B. burgdorferi, using biosensors include those known in the art and
those of: U.S. Patent Nos. 5721102, 5658732, 5631170, and World Patent Nos. WO97/35011,
WO/9720203, each incorporated herein in their entireties.
Thus, the present invention includes both bio chips and biosensors comprising
polynucleotides of the present invention and methods of their use.

Assaying Borrelia polypeptide levels in a biological sample can occur using any
art-known method, such as antibody-based techniques. For example, Borrelia polypeptide
expression in tissues can be studied with classical immunohistological methods. In these, the
specific recognition is provided by the primary antibody (polyclonal or monoclonal) but the
secondary detection system can utilize fluorescent, enzyme, or other conjugated secondary
antibodies. As a result, an immunohistological staining of tissue section for pathological
examination is obtained. Tissues can also be extracted, e.g., with urea and neutral detergent, for
the liberation of Borrelia polypeptides for Western-blot or dot/slot assay. See, e.g., Jalkanen,
M. et al. (1985) J. Cell. Biol. 101:976-985; Jalkanen, M. et al. (1987) J. Cell. Biol.
105:3087-3096. In this technique, which is based on the use of cationic solid phases,
quantitation of a Borrelia polypeptide can be accomplished using an isolated Borrelia polypeptide
as a standard. This technique can also be applied to body fluids.

Other antibody-based methods useful for detecting Borrelia polypeptide gene expression
include immunoassays, such as the ELISA and the radioimmunoassay (RIA). For example,
Borrelia polypeptide-specific monoclonal antibodies can be used both as an immunoabsorbent
and as an enzyme-labeled probe to detect and quantify a Borrelia polypeptide. The amount of a
Borrelia polypeptide present in the sample can be calculated by reference to the amount present in
a standard preparation using a linear regression computer algorithm. Such an ELISA is described
in Iacobelli et al. (1988) Breast Cancer Research and Treatment 11:19-30. In another ELISA
assay, two distinct specific monoclonal antibodies can be used to detect Borrelia polypeptides in a
body fluid. In this assay, one of the antibodies is used as the immunoabsorbent and the other as
the enzyme-labeled probe.
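
The linear regression step mentioned above can be illustrated with a short sketch. It assumes that the ELISA signal is linear in the amount of polypeptide over the range of the standards; the function names, the standard amounts, and the signal values are hypothetical and are not taken from the cited reference.

```python
def fit_standard_curve(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(signal, slope, intercept):
    """Invert the fitted line to read an unknown sample off the standard curve."""
    return (signal - intercept) / slope


# Hypothetical standard preparation: amounts (ng) and measured absorbances.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_standard_curve(standard_amounts, standard_signals)
sample_amount = estimate_amount(0.65, slope, intercept)
```

Readings that fall outside the range spanned by the standards would be extrapolations and are best treated with caution.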

The above techniques may be conducted essentially as a "one-step" or "two-step" assay.
The "one-step" assay involves contacting the Borrelia polypeptide with immobilized antibody
and, without washing, contacting the mixture with the labeled antibody. The "two-step" assay
involves washing before contacting the mixture with the labeled antibody. Other conventional
methods may also be employed as suitable. It is usually desirable to immobilize one component
of the assay system on a support, thereby allowing other components of the system to be brought
into contact with the component and readily removed from the sample. Variations of the above
and other immunological methods included in the present invention can also be found in Harlow
et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press,
2nd ed. 1988).

Suitable enzyme labels include, for example, those from the oxidase group, which
catalyze the production of hydrogen peroxide by reacting with substrate. Glucose oxidase is
particularly preferred as it has good stability and its substrate (glucose) is readily available.
Activity of an oxidase label may be assayed by measuring the concentration of hydrogen peroxide
formed by the enzyme-labeled antibody/substrate reaction. Besides enzymes, other suitable
labels include radioisotopes, such as iodine (¹²⁵I, ¹²¹I), carbon (¹⁴C), sulphur (³⁵S), tritium (³H),
indium (¹¹²In), and technetium (⁹⁹ᵐTc), and fluorescent labels, such as fluorescein and
rhodamine, and biotin.

Further suitable labels for the Borrelia polypeptide-specific antibodies of the present
invention are provided below. Examples of suitable enzyme labels include malate
dehydrogenase, Borrelia nuclease, delta-5-steroid isomerase, yeast-alcohol dehydrogenase,
alpha-glycerol phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline
phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase,
glucose-6-phosphate dehydrogenase, glucoamylase, and acetylcholine esterase.

Examples of suitable radioisotopic labels include ³H, ¹¹¹In, ¹²⁵I, ¹³¹I, ³²P, ³⁵S, ¹⁴C, ⁵¹Cr,
⁵⁷To, ⁵⁸Co, ⁵⁹Fe, ⁷⁵Se, ¹⁵²Eu, ⁹⁰Y, ⁶⁷Cu, ²¹⁷Ci, ²¹¹At, ²¹²Pb, ⁴⁷Sc, ¹⁰⁹Pd, etc. ¹¹¹In is a preferred
isotope where in vivo imaging is used since it avoids the problem of dehalogenation of the ¹²⁵I-
or ¹³¹I-labeled monoclonal antibody by the liver. In addition, this radionuclide has a more
favorable gamma emission energy for imaging. See, e.g., Perkins et al. (1985) Eur. J. Nucl.
Med. 10:296-301; Carasquillo et al. (1987) J. Nucl. Med. 28:281-287. For example, ¹¹¹In
coupled to monoclonal antibodies with 1-(p-isothiocyanatobenzyl)-DPTA has shown little uptake
in non-tumorous tissues, particularly the liver, and therefore enhances specificity of tumor
localization. See Esteban et al. (1987) J. Nucl. Med. 28:861-870.

Examples of suitable non-radioactive isotopic labels include ¹⁵⁷Gd, ⁵⁵Mn, ¹⁶²Dy, ⁵²Tr,
and ⁵⁶Fe.

Examples of suitable fluorescent labels include a ¹⁵²Eu label, a fluorescein label, an
isothiocyanate label, a rhodamine label, a phycoerythrin label, a phycocyanin label, an 
allophycocyanin label, an o-phthaldehyde label, and a fluorescamine label. 

Examples of suitable toxin labels include Pseudomonas toxin, diphtheria toxin, ricin,
and cholera toxin.

Examples of chemiluminescent labels include a luminol label, an isoluminol label, an
aromatic acridinium ester label, an imidazole label, an acridinium salt label, an oxalate ester label,
a luciferin label, a luciferase label, and an aequorin label.

Examples of nuclear magnetic resonance contrasting agents include heavy metal nuclei
such as Gd, Mn, and iron.

Typical techniques for binding the above-described labels to antibodies are provided by
Kennedy et al. (1976) Clin. Chim. Acta 70:1-31, and Schurs et al. (1977) Clin. Chim. Acta
81:1-40. Coupling techniques mentioned in the latter are the glutaraldehyde method, the
periodate method, the dimaleimide method, and the m-maleimidobenzyl-N-hydroxy-succinimide ester
method, all of which methods are incorporated by reference herein.

In a related aspect, the invention includes a diagnostic kit for use in screening serum
containing antibodies specific against B. burgdorferi infection. Such a kit may include an
isolated B. burgdorferi antigen comprising an epitope which is specifically immunoreactive with
at least one anti-B. burgdorferi antibody. Such a kit also includes means for detecting the
binding of said antibody to the antigen. In specific embodiments, the kit may include a
recombinantly produced or chemically synthesized peptide or polypeptide antigen. The peptide
or polypeptide antigen may be attached to a solid support.

In a more specific embodiment, the detecting means of the above-described kit includes a
solid support to which said peptide or polypeptide antigen is attached. Such a kit may also
include a non-attached reporter-labeled anti-human antibody. In this embodiment, binding of the
antibody to the B. burgdorferi antigen can be detected by binding of the reporter-labeled antibody
to the anti-B. burgdorferi polypeptide antibody.

In a related aspect, the invention includes a method of detecting B. burgdorferi infection
in a subject. This detection method includes reacting a body fluid, preferably serum, from the
subject with an isolated B. burgdorferi antigen, and examining the antigen for the presence of
bound antibody. In a specific embodiment, the method includes a polypeptide antigen attached to
a solid support, and serum is reacted with the support. Subsequently, the support is reacted with
a reporter-labeled anti-human antibody. The support is then examined for the presence of
reporter-labeled antibody.

The solid surface reagent employed in the above assays and kits is prepared by known
techniques for attaching protein material to solid support material, such as polymeric beads, dip
sticks, 96-well plates or filter material. These attachment methods generally include non-specific
adsorption of the protein to the support or covalent attachment of the protein, typically through a
free amine group, to a chemically reactive group on the solid support, such as an activated
carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in
conjunction with biotinylated antigen(s).

The polypeptides and antibodies of the present invention, including fragments thereof,
may be used to detect Borrelia species including B. burgdorferi using bio chip and biosensor
technology. Bio chips and biosensors of the present invention may comprise the polypeptides of
the present invention to detect antibodies which specifically recognize Borrelia species, including
B. burgdorferi. Bio chips and biosensors of the present invention may also comprise antibodies
which specifically recognize the polypeptides of the present invention to detect Borrelia species,
including B. burgdorferi or specific polypeptides of the present invention. Bio chips or
biosensors comprising polypeptides or antibodies of the present invention may be used to detect
Borrelia species, including B. burgdorferi, in biological and environmental samples and to
diagnose an animal, including humans, with a B. burgdorferi or other Borrelia infection. Thus,
the present invention includes both bio chips and biosensors comprising polypeptides or
antibodies of the present invention and methods of their use.

The bio chips of the present invention may further comprise polypeptide sequences of
other pathogens including bacterial, viral, parasitic, and fungal polypeptide sequences, in addition
to the polypeptide sequences of the present invention, for use in rapid differential pathogenic
detection and diagnosis. The bio chips of the present invention may further comprise antibodies
or fragments thereof specific for other pathogens including bacterial, viral, parasitic, and fungal



polypeptide sequences, in addition to the antibodies or fragments thereof of the present
invention, for use in rapid differential pathogenic detection and diagnosis. The bio chips and
biosensors of the present invention may also be used to monitor a B. burgdorferi or other
Borrelia infection and to monitor the genetic changes (amino acid deletions, insertions,
substitutions, etc.) in response to drug therapy in the clinic and drug development in the
laboratory. The bio chips and biosensors comprising polypeptides or antibodies of the present
invention may also be used to simultaneously monitor the expression of a multiplicity of
polypeptides, including those of the present invention. The polypeptides used to comprise a bio
chip or biosensor of the present invention may be specified in the same manner as for the
fragments, i.e., by their N-terminal and C-terminal positions or length in contiguous amino acid
residues. Methods and particular uses of the polypeptides and antibodies of the present invention
to detect Borrelia species, including B. burgdorferi, or specific polypeptides using bio chip and
biosensor technology include those known in the art, those of the U.S. Patent Nos. and World
Patent Nos. listed above for bio chips and biosensors using polynucleotides of the present
invention, and those of: U.S. Patent Nos. 5658732, 5135852, 5567301, 5677196, 5690894
and World Patent Nos. WO9729366, WO9612957, each incorporated herein in their entireties.

5. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides
methods of obtaining and identifying agents which bind to a protein encoded by one of the ORFs
of the present invention or to one of the fragments and the Borrelia burgdorferi fragment and
contigs herein described.

In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the
present invention, or an isolated fragment of the Borrelia burgdorferi genome; and

(b) determining whether the agent binds to said protein or said fragment.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention.

Alternatively, agents may be rationally selected or designed. As used herein, an agent is
said to be "rationally selected or designed" when the agent is chosen based on the configuration
of the particular protein. For example, one skilled in the art can readily adapt currently available
procedures to generate peptides, pharmaceutical agents and the like capable of binding to a
specific peptide sequence in order to generate rationally designed antipeptide peptides, for
example see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic
Peptides, A User's Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al.,
Biochemistry 25:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or rationally
designed/selected. Targeting the ORF or EMF allows a skilled artisan to design sequence
specific or element specific agents, modulating the expression of either a single ORF or multiple
ORFs which rely on the same EMF for expression control.

One class of DNA binding agents are agents which contain base residues which hybridize
or form a triple helix by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric
derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed
to be complementary to a region of the gene involved in transcription (triple helix - see Lee et al.,
Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al.,
Science 257:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention can be used to design antisense
and triple helix-forming oligonucleotides, and other DNA binding agents.
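
As a minimal sketch of how sequence information could be turned into an antisense oligonucleotide of the 20 to 40 base length mentioned above, the following takes a window of the coding (sense) strand and returns its reverse complement, which is complementary to the corresponding stretch of mRNA. The function names and the example sequence are hypothetical; the sketch ignores practical design criteria such as secondary structure, melting temperature and backbone chemistry.

```python
# Watson-Crick pairing for a DNA oligonucleotide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(sense_dna, start, length=20):
    """Antisense DNA oligo against sense_dna[start:start + length].

    sense_dna is the coding strand (same sequence as the mRNA, with T for U).
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    target = sense_dna[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# Hypothetical coding-strand fragment, used only for illustration.
example_sense = "ATGGCTAAAGGTCTGACCGATCGTTAGC"
example_oligo = antisense_oligo(example_sense, start=0, length=20)
```

Here `example_oligo` is complementary to the first 20 bases of the example fragment, including its start codon.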

6. Pharmaceutical Compositions and Vaccines 

The present invention further provides pharmaceutical agents which can be used to
modulate the growth or pathogenicity of Borrelia burgdorferi, or another related organism, in
vivo or in vitro. As used herein, a "pharmaceutical agent" is defined as a composition of matter
which can be formulated using known techniques to provide a pharmaceutical composition. As
used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical agents
which are derived from the proteins encoded by the ORFs of the present invention or are agents
which are identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of
Borrelia burgdorferi or a related organism, in vivo or in vitro" when the agent reduces the rate of
growth, rate of division, or viability of the organism in question. The pharmaceutical agents of
the present invention can modulate the growth or pathogenicity of an organism in many fashions,
although an understanding of the underlying mechanism of action is not needed to practice the
use of the pharmaceutical agents of the present invention. Some agents will modulate the growth
by binding to an important protein thus blocking the biological activity of the protein, while other
agents may bind to a component of the outer surface of the organism blocking attachment or
rendering the organism more prone to attack by the body's natural immune system. Alternatively, the
agent may comprise a protein encoded by one of the ORFs of the present invention and serve as a
vaccine. The development and use of a vaccine based on outer membrane components are well
known in the art.

As used herein, a "related organism" is a broad term which refers to any organism whose
growth can be modulated by one of the pharmaceutical agents of the present invention. In
general, such an organism will contain a homolog of the protein which is the target of the
pharmaceutical agent or the protein used as a vaccine. As such, related organisms do not need to
be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered
in a convenient manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular,
subcutaneous, intranasal or intradermal routes. The pharmaceutical compositions are
administered in an amount which is effective for treating and/or prophylaxis of the specific
indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body
weight per day. In most cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body
weight daily, taking into account the routes of administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form
a chemical derivative. As used herein, a molecule is said to be a "chemical derivative" of another
molecule when it contains additional chemical moieties not normally a part of the molecule. Such
moieties may improve the molecule's solubility, absorption, biological half life, etc. The
moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are
disclosed in, among other sources, REMINGTON'S PHARMACEUTICAL SCIENCES (1980),
cited elsewhere herein.

For example, such moieties may change an immunological character of the functional
derivative, such as affinity for a given antibody. Such changes in immunomodulation activity are
measured by the appropriate assay, such as a competitive type immunoassay. Modifications of
such protein properties as redox or thermal stability, biological half-life, hydrophobicity,
susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into
multimers also may be effected in this way and can be assayed by methods well known to the
skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by
providing the agent to a patient by any suitable means (e.g., inhalation, intravenously,
intramuscularly, subcutaneously, enterally, or parenterally). It is preferred to administer the
agent of the present invention so as to achieve an effective concentration within the blood or
tissue in which the growth of the organism is to be controlled. To achieve an effective blood
concentration, the preferred method is to administer the agent by injection. The administration
may be by continuous infusion, or by single or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the
administered agent will vary depending upon such factors as the patient's age, weight, height, 
sex, general medical condition, previous medical history, etc. In general, it is desirable to 
provide the recipient with a dosage of agent which is in the range of from about 1 pg/kg to 10 
mg/kg (body weight of patient), although a lower or higher dosage may be administered. The 
therapeutically effective dose can be lowered by using combinations of the agents of the present 
invention or another agent. 

As used herein, two or more compounds or agents are said to be administered "in 
combination" with each other when either (1) the physiological effects of each compound, or (2) 
the serum concentrations of each compound can be measured at the same time. The composition 
of the present invention can be administered concurrently with, prior to, or following the 
administration of the other agent. 

The agents of the present invention are intended to be provided to recipient subjects in an 
amount sufficient to decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or 
"therapeutic" purpose. When provided prophylactically, the agent(s) are provided in advance of 
any symptoms indicative of the organism's growth. The prophylactic administration of the
agent(s) serves to prevent, attenuate, or decrease the rate of onset of any subsequent infection. 
When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an 
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the 
pathological symptoms of the infection and to increase the rate of recovery. 

The agents of the present invention are administered to a subject, such as a mammal, or a 
patient, in a pharmaceutically acceptable form and in a therapeutically effective concentration. A 
composition is said to be "pharmacologically acceptable" if its administration can be tolerated by a 
recipient patient. Such an agent is said to be administered in a "therapeutically effective amount" 
if the amount administered is physiologically significant. An agent is physiologically significant 
if its presence results in a detectable change in the physiology of a recipient patient. 

The agents of the present invention can be formulated according to known methods to 
prepare pharmaceutically useful compositions, whereby these materials, or their functional 
derivatives, are combined in a mixture with a pharmaceutically acceptable carrier vehicle. 
Suitable vehicles and their formulation, inclusive of other human proteins, e.g., human serum 
albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 
16th Ed., Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a 
pharmaceutically acceptable composition suitable for effective administration, such compositions 
will contain an effective amount of one or more of the agents of the present invention, together 
with a suitable amount of carrier vehicle. 

Additional pharmaceutical methods may be employed to control the duration of action.
Controlled release preparations may be achieved through the use of polymers to complex or absorb
one or more of the agents of the present invention. The controlled delivery may be effectuated by
a variety of well known techniques, including formulation with macromolecules such as, for
example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the
macromolecules and the agent in the formulation, and by appropriate use of methods of
incorporation, which can be manipulated to effectuate a desired time course of release. Another
possible method to control the duration of action by controlled release preparations is to
incorporate agents of the present invention into particles of a polymeric material such as
polyesters, polyamino acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers.
Alternatively, instead of incorporating these agents into polymeric particles, it is possible to
entrap these materials in microcapsules prepared, for example, by coacervation techniques or by
interfacial polymerization with, for example, hydroxymethylcellulose or gelatin microcapsules
and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems,
for example, liposomes, albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions. Such techniques are disclosed in REMINGTON'S
PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more
containers filled with one or more of the ingredients of the pharmaceutical compositions of the
invention. Associated with such container(s) can be a notice in the form prescribed by a
governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological
products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with 
other therapeutic compounds. 

7. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a
random shotgun approach. This procedure, described in detail in the examples that follow, has
eliminated the up-front cost of isolating and ordering overlapping or contiguous subclones prior
to the start of the sequencing protocols.

Certain aspects of the present invention are described in greater detail in the examples that
follow. The examples are provided by way of illustration. Other aspects and embodiments of
the present invention are contemplated by the inventors, as will be clear to those of skill in the art
from reading the present disclosure.

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from
the Lander and Waterman (Lander and Waterman, Genomics 2:231 (1988)) application of the
equation for the Poisson distribution. According to this treatment, the probability, $P_0$, that any
given base in a sequence of size L, in nucleotides, is not sequenced after a certain amount, n, in
nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8
Mb of sequence has been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$.
The probability that any given base has not been sequenced is the same as the probability that any
region of the whole sequence L has not been determined and, therefore, is equivalent to the
fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When
14 Mb of sequence has been generated, coverage is 5X for a 2.8 Mb genome and the unsequenced
fraction drops to 0.0067 or 0.67%. 5X coverage of a 2.8 Mb sequence can be attained by
sequencing approximately 17,000 random clones from both insert ends with an average sequence
read length of 410 bp.

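Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the
average gap size, g, follows the equation g = L/n, where n in this equation denotes the number of
random sequence reads (about 34,000 in the example above) rather than the number of nucleotides
sequenced. Thus, 5X coverage leaves about 240 gaps
averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.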

The treatment above is essentially that of Lander and Waterman, Genomics 2:231
(1988).
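
The figures quoted in this analysis can be reproduced with a short calculation. The sketch below assumes the Poisson treatment summarized above, with the fold coverage taken as total bases sequenced divided by genome length and the average gap size taken as genome length divided by the number of reads; the function names are illustrative only.

```python
import math


def unsequenced_fraction(genome_len, total_bases):
    """P0 = exp(-m), with fold coverage m = total_bases / genome_len."""
    return math.exp(-total_bases / genome_len)


def gap_statistics(genome_len, n_reads, read_len):
    """Return (coverage, total gap length, average gap size, number of gaps)."""
    coverage = n_reads * read_len / genome_len
    total_gap = genome_len * math.exp(-coverage)  # G = L * exp(-m)
    avg_gap = genome_len / n_reads                # g = L / (number of reads)
    n_gaps = total_gap / avg_gap
    return coverage, total_gap, avg_gap, n_gaps


GENOME_LEN = 2.8e6      # 2.8 Mb sequence, as in the text
N_READS = 2 * 17000     # 17,000 clones read from both insert ends
READ_LEN = 410          # average read length in bp

p0_at_1x = unsequenced_fraction(GENOME_LEN, 2.8e6)
p0_at_5x = unsequenced_fraction(GENOME_LEN, 14e6)
coverage, total_gap, avg_gap, n_gaps = gap_statistics(GENOME_LEN, N_READS, READ_LEN)
```

With these inputs the coverage comes out just under 5X, the average gap is about 82 bp, and the gap count lands in the low hundreds, in line with the approximate values given above.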

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a
nearly ideal library of cloned genomic fragments is required. The following library construction
procedure was developed to achieve this end.

Borrelia burgdorferi DNA is prepared by phenol extraction. A mixture containing 200 µg
DNA in 1.0 ml of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 50% glycerol is
processed through a nebulizer (IPI Medical Products) with a stream of nitrogen adjusted to 35
kPa for 2 minutes. The nebulized DNA is ethanol precipitated and redissolved in 500 µl TE
buffer.

To create blunt-ends, a 100 µl aliquot of the resuspended DNA is digested with 5 units of
BAL31 nuclease (New England BioLabs) for 10 min at 30°C in 200 µl BAL31 buffer. The
digested DNA is phenol-extracted, ethanol-precipitated, redissolved in 100 µl TE buffer, and
then size-fractionated by electrophoresis through a 1.0% low melting temperature agarose gel.
The section containing DNA fragments 1.6-2.0 kb in size is excised from the gel, and the LGT
agarose is melted and the resulting solution is extracted with phenol to separate the agarose from
the DNA. DNA is ethanol precipitated and redissolved in 20 µl of TE buffer for ligation to
vector.

A two-step ligation procedure is used to produce a plasmid library with 97% inserts, of
which >99% were single inserts. The first ligation mixture (50 µl) contains 2 µg of DNA
fragments, 2 µg pUC18 DNA (Pharmacia) cut with SmaI and dephosphorylated with bacterial
alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and is incubated at 14°C for 4 hr.
The ligation mixture then is phenol extracted and ethanol precipitated, and the precipitated DNA is
dissolved in 20 µl TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete
bands in a ladder are visualized by ethidium bromide-staining and UV illumination and identified
by size as insert (I), vector (v), v+I, v+2I, v+3I, etc. The portion of the gel containing v+I DNA
is excised and the v+I DNA is recovered and resuspended into 20 µl TE. The v+I DNA then is
blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture (50 µl)
containing the v+I linears, 500 µM each of the 4 dNTPs, and 9 units of T4 polymerase (New
England BioLabs), under recommended buffer conditions. After phenol extraction and ethanol
precipitation the repaired v+I linears are dissolved in 20 µl TE. The final ligation to produce
circles is carried out in a 50 µl reaction containing 5 µl of v+I linears and 5 units of T4 ligase at
14°C overnight. After 10 min. at 70°C the following day, the reaction mixture is stored at -20°C.

This two-stage procedure results in a molecularly random collection of single-insert
plasmid recombinants with minimal contamination from double-insert chimeras (<1%) or free
vector (<3%).

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli
host cells deficient in all recombination and restriction functions (A. Greener, Strategies 3(1):5
(1990)) are used to prevent rearrangements, deletions, and loss of clones by restriction.
Furthermore, transformed cells are plated directly on antibiotic diffusion plates to avoid the usual
broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating is carried out as follows. A 100 µl aliquot of Epicurian Coli SURE II
Supercompetent Cells (Stratagene 200152) is thawed on ice and transferred to a chilled Falcon
2059 tube on ice. A 1.7 µl aliquot of 1.42 M beta-mercaptoethanol is added to the aliquot of cells
to a final concentration of 25 mM. Cells are incubated on ice for 10 min. A 1 µl aliquot of the
final ligation is added to the cells and incubated on ice for 30 min. The cells are heat pulsed for
30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture is
eliminated from this protocol in order to minimize the preferential growth of any given
transformed cell. Instead the transformation mixture is plated directly on a nutrient rich SOB
plate containing a 5 ml bottom layer of SOB agar (5% SOB agar: 20 g tryptone, 5 g yeast extract,
0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented with
0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is
supplemented with 1 ml X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4/100 ml SOB agar.
The 15 ml top layer is poured just prior to plating. Our titer is approximately 100 colonies/10 µl
aliquot of transformation.

All colonies are picked for template preparation regardless of size. Thus, only clones lost
due to "poison" DNA or deleterious gene products are deleted from the library, resulting in a
slight increase in gap number over that expected.
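
The beta-mercaptoethanol addition in the plating protocol above can be checked with a simple dilution calculation. This is only an illustrative sketch: the function name is hypothetical, and it assumes the volumes are additive.

```python
def final_concentration_mM(stock_molar, stock_volume_ul, sample_volume_ul):
    """Final concentration (mM) after adding a stock solution to a sample."""
    total_volume_ul = stock_volume_ul + sample_volume_ul
    return 1000.0 * stock_molar * stock_volume_ul / total_volume_ul


# 1.7 µl of 1.42 M beta-mercaptoethanol added to a 100 µl aliquot of cells.
bme_mM = final_concentration_mM(1.42, 1.7, 100.0)
```

The result is a little under the nominal 25 mM stated in the protocol, consistent with that figure being a rounded value.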

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates are prepared using a "boiling bead"
method developed in collaboration with Advanced Genetic Technology Corp. (Gaithersburg,
MD) (Adams et al., Science 252:1651 (1991); Adams et al., Nature 355:632 (1992)). Plasmid
preparation is performed in a 96-well format for all stages of DNA preparation from bacterial
growth through final DNA purification. Template concentration is determined using Hoechst
Dye and a Millipore Cytofluor. DNA concentrations are not adjusted, but low-yielding templates
are identified where possible and not sequenced.

Templates are also prepared from two Borrelia burgdorferi lambda genomic libraries. An
amplified library is constructed in the vector Lambda GEM-12 (Promega) and an unamplified
library is constructed in Lambda DASH II (Stratagene). In particular, for the unamplified lambda
library, Borrelia burgdorferi DNA (> 100 kb) is partially digested in a reaction mixture (200 µl)
containing 50 µg DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and electrophoresed on a 0.5% low melting agarose gel at 2V/cm for
7 hours. Fragments from 15 to 25 kb are excised and recovered in a final volume of 6 µl. One
µl of fragments is used with 1 µl of DASHII vector (Stratagene) in the recommended ligation
reaction. One µl of the ligation mixture is used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract (Stratagene, #227711).
Phage are plated directly without amplification from the packaging mixture (after dilution with
500 µl of recommended SM buffer and chloroform treatment). Yield is about 2.5 x 10^3 pfu/µl.
The amplified library is prepared essentially as above except the lambda GEM-12 vector is used.
After packaging, about 3.5 x 10^4 pfu are plated on the restrictive NM539 host. The lysate is
harvested in 2 ml of SM buffer and stored frozen in 7% dimethylsulfoxide. The phage titer is
approximately 1 x 10^9 pfu/ml.

Liquid lysates (100 µl) are prepared from randomly selected plaques (from the
unamplified library) and template is prepared by long-range PCR using T7 and T3 vector-specific
primers.

Sequencing reactions are carried out on plasmid and/or PCR templates using the AB
Catalyst LabStation with Applied Biosystems PRISM Ready Reaction Dye Primer Cycle
Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers (Adams
et al., Nature 368:414 (1994)). Dye terminator sequencing reactions are carried out on the
lambda templates on a Perkin-Elmer 9600 Thermocycler using the Applied Biosystems Ready
Reaction Dye Terminator Cycle Sequencing kits. T7 and SP6 primers are used to sequence the
ends of the inserts from the Lambda GEM-12 library and T7 and T3 primers are used to sequence
the ends of the inserts from the Lambda DASH II library. Sequencing reactions are performed
by eight individuals using an average of fourteen AB 373 DNA Sequencers per day. All
sequencing reactions are analyzed using the Stretch modification of the AB 373, primarily using a
34 cm well-to-read distance. The overall sequencing success rate is approximately
85% for M13-21 and M13RP1 sequences and 65% for dye-terminator reactions. The average
usable read length is 485 bp for M13-21 sequences, 445 bp for M13RP1 sequences, and 375 bp
for dye-terminator reactions.
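
The success rates and read lengths quoted above can be combined into an expected number of usable bases per attempted reaction. The following is a minimal sketch of that arithmetic only; the function and dictionary names are illustrative assumptions and are not part of the described protocol.

```python
# Figures quoted in the text: (approximate success rate, average usable read length in bp).
READ_STATS = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_bases_per_attempt(success_rate, read_length_bp):
    """Expected usable bases from one attempted reaction (failed reactions count as 0 bp)."""
    return success_rate * read_length_bp


def expected_yield(stats=READ_STATS):
    """Return {chemistry: expected usable bp per attempted reaction}."""
    return {
        name: expected_bases_per_attempt(rate, length)
        for name, (rate, length) in stats.items()
    }


if __name__ == "__main__":
    for name, bases in expected_yield().items():
        print(f"{name}: about {bases:.0f} usable bp per attempted reaction")
```

Under these figures a dye-terminator attempt is expected to return noticeably fewer usable bases than a dye-primer attempt, because both its success rate and its read length are lower.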

Richards et al., Chapter 28 in AUTOMATED DNA SEQUENCING AND ANALYSIS,
M. D. Adams, C. Fields, J. C. Venter, Eds., Academic Press, London, (1994) described the
value of using sequence from both ends of sequencing templates to facilitate ordering of contigs
in shotgun assembly projects of lambda and cosmid clones. We balance the desirability of
both-end sequencing (including the reduced cost of lower total number of templates) against shorter
read-lengths for sequencing reactions performed with the M13RP1 (reverse) primer compared to
the M13-21 (forward) primer. Approximately one-half of the templates are sequenced from both
ends. Random reverse sequencing reactions are done based on successful forward sequencing
reactions. Some M13RP1 sequences are obtained in a semi-directed fashion: M13-21 sequences
pointing outward at the ends of contigs are chosen for M13RP1 sequencing in an effort to
specifically order contigs.

4. Protocol for Automated Cycle Sequencing 

The sequencing is carried out using ABI Catalyst robots and AB 373 Automated DNA
Sequencers. The Catalyst robot is a publicly available sophisticated pipetting and temperature
control robot which has been developed specifically for DNA sequencing reactions. The Catalyst
combines pre-aliquoted templates and reaction mixes consisting of deoxy- and
dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates are combined in the wells of an
aluminum 96-well thermocycling plate. Thirty consecutive cycles of linear amplification (i.e.,
one primer synthesis) steps are performed including denaturation, annealing of primer and
template, and extension (i.e., DNA synthesis). A heated lid with rubber gaskets on the
thermocycling plate prevents evaporation without the need for an oil overlay.

Two sequencing protocols are used: one for dye-labelled primers and a second for
dye-labelled dideoxy chain terminators. The shotgun sequencing involves use of four dye-labelled
sequencing primers, one for each of the four terminator nucleotides. Each dye-primer is labelled
with a different fluorescent dye, permitting the four individual reactions to be combined into one
lane of the 373 DNA Sequencer for electrophoresis, detection, and base-calling. ABI currently
supplies pre-mixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated
templates with both dye-primers and dye-terminators with approximately equal fidelity, although
plasmid templates generally give longer usable sequences.

Thirty-two reactions are loaded per AB 373 Sequencer each day, for a total of 960
samples. Electrophoresis is run overnight following the manufacturer's protocols, and the data is
collected for twelve hours. Following electrophoresis and fluorescence detection, the ABI 373
performs automatic lane tracking and base-calling. The lane-tracking is confirmed visually. Each
sequence electropherogram (or fluorescence lane trace) is inspected visually and assessed for
quality. Trailing sequences of low quality are removed and the sequence itself is loaded via
software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
is removed automatically by a software program. Average edited lengths of sequences from the
standard ABI 373 are around 400 bp and depend mostly on the quality of the template used for
the sequencing reaction. ABI 373 Sequencers converted to Stretch Liners provide a longer
electrophoresis path prior to fluorescence detection and increase the average number of usable
bases to 500-600 bp.

INFORMATICS 
1. Data Management

A number of information management systems for a large-scale sequencing lab have been
developed. (For review see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth
Annual Hawaii International Conference on System Sciences, IEEE Computer Society Press,
Washington, D.C., 585 (1993).) The system used to collect and assemble the sequence data was
developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and
correlates all information collected during the entire operation from template preparation to final
analysis of the genome. Because the raw output of the ABI 373 Sequencers was based on a
Macintosh platform and the data management system chosen was based on a Unix platform, it
was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a
minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of
thousands of sequence fragments was employed to generate contigs. The TIGR Assembler
simultaneously clusters and assembles fragments of the genome. In order to obtain the speed
necessary to assemble more than 10^4 fragments, the algorithm builds a hash table of 12 bp
oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into
repetitive elements. Beginning with a single seed sequence fragment, TIGR Assembler extends
the current contig by attempting to add the best matching fragment based on oligonucleotide
content. The contig and candidate fragment are aligned using a modified version of the
Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., Methods
in Enzymology 164:165 (1988)). The contig is extended by the fragment only if strict criteria for
the quality of the match are met. The match criteria include the minimum length of overlap, the
maximum length of an unmatched end, and the minimum percentage match. These criteria are
automatically lowered by the algorithm in regions of minimal coverage and raised in regions with
a possible repetitive element. The number of potential overlaps for each fragment determines
which fragments are likely to fall into repetitive elements. Fragments representing the boundaries
of repetitive elements and potentially chimeric fragments are often rejected based on partial
mismatches at the ends of alignments and excluded from the current contig. TIGR Assembler is
designed to take advantage of clone size information coupled with sequencing from both ends of
each template. It enforces the constraint that sequence fragments from two ends of the same
template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library). The
process resulted in 155 contigs as represented by SEQ ID NOs: 1-155.
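
The hash-table step described above can be illustrated with a short sketch. This is not the TIGR Assembler code. It is a simplified illustration that indexes 12 bp subsequences on the forward strand only and omits reverse-complement matching, alignment, and the match criteria. The function names and the `max_partners` cutoff are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length stated in the text


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def potential_overlaps(fragments, k=K):
    """Return {fragment id: set of other fragment ids sharing at least one k-mer}."""
    index = build_kmer_index(fragments, k)
    overlaps = {frag_id: set() for frag_id in fragments}
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            overlaps[a].add(b)
            overlaps[b].add(a)
    return overlaps


def likely_repetitive(overlaps, max_partners):
    """Flag fragments with more candidate partners than a chosen (hypothetical) cutoff."""
    return {f for f, partners in overlaps.items() if len(partners) > max_partners}


if __name__ == "__main__":
    frags = {
        "a": "ACGTACGTACGTAAAC",
        "b": "GTACGTACGTAAACGG",
        "c": "TTTTTTTTTTTTTTTT",
    }
    print(potential_overlaps(frags))
```

Only fragment pairs that share a k-mer need to be passed to the slower gapped alignment, which is where the speed gain comes from. A fragment with an unusually large partner set is a candidate member of a repetitive element.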

3. Identifying Genes

The predicted coding regions of the Borrelia burgdorferi genome were initially defined
with the program GeneMark, which finds ORFs using a probabilistic classification technique.
The predicted coding region sequences were used in searches against a database of all nucleotide
sequences from GenBank (July, 1997), using the BLASTN search method to identify overlaps
of 50 or more nucleotides with at least a 95% identity (using default parameters). Those ORFs
with nucleotide sequence matches are shown in Table 1. The ORFs without such matches were
translated to protein sequences and compared to a non-redundant database of known proteins
generated by combining the Swiss-Prot, PIR and GenPept databases. ORFs that matched a
database protein with BLASTP probability less than or equal to 0.01 are shown in Table 2. The
table also lists assigned functions based on the closest match in the databases. ORFs that did not
match protein or nucleotide sequences in the databases at these levels are shown in Table 3.
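
The assignment of ORFs to Tables 1-3 follows a simple decision rule, which may be sketched as follows. The function name and input layout are assumptions made for illustration. The thresholds are those stated above, and the search results themselves are assumed to have been obtained separately.

```python
def classify_orf(blastn_hits, blastp_best_p):
    """Assign an ORF to Table 1, 2 or 3 using the thresholds stated in the text.

    blastn_hits:   list of (overlap_length_nt, percent_identity) tuples
    blastp_best_p: smallest BLASTP probability for the translated ORF,
                   or None if no protein match was found
    """
    # Table 1: a nucleotide match of 50 or more nucleotides at 95% identity or better.
    for overlap_nt, identity in blastn_hits:
        if overlap_nt >= 50 and identity >= 95.0:
            return "Table 1"
    # Table 2: no such nucleotide match, but a protein match with P <= 0.01.
    if blastp_best_p is not None and blastp_best_p <= 0.01:
        return "Table 2"
    # Table 3: no match at either level.
    return "Table 3"


if __name__ == "__main__":
    print(classify_orf([(120, 98.5)], None))
    print(classify_orf([(40, 99.0)], 1e-20))
    print(classify_orf([], 0.5))
```

The nucleotide test is applied first, so an ORF with both kinds of match is reported in Table 1 only.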

ILLUSTRATIVE APPLICATIONS 

1. Production of an Antibody to a Borrelia burgdorferi Protein

Substantially pure protein or polypeptide is isolated from the transfected or transformed
cells using any one of the methods known in the art. The protein can also be produced in a
recombinant prokaryotic expression system, such as E. coli, or can be chemically synthesized.
Concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to
the protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes of any of the peptides identified and isolated as
described can be prepared from murine hybridomas according to the classical method of Kohler,
G. and Milstein, C., Nature 256:495 (1975) or modifications of the methods thereof. Briefly, a
mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a
few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen
isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells,
and the excess unfused cells destroyed by growth of the system on selective media comprising
aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution
placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by
immunoassay procedures, such as ELISA, as originally described by Engvall, E., Meth.
Enzymol. 70:419 (1980), and modified methods thereof. Selected positive clones can be
expanded and their monoclonal antibody product harvested for use. Detailed procedures for
monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein
can be prepared by immunizing suitable animals with the expressed protein described above,
which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody
production is affected by many factors related both to the antigen and the host species. For
example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with
either inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng
level) of antigen administered at multiple intradermal sites appear to be most reliable. An
effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin.
Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when
antibody titer thereof, as determined semi-quantitatively, for example, by double
immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for
example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, Wier,
D., ed, Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to
0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for the antigen is determined by
preparing competitive binding curves, as described, for example, by Fisher, D., Chap. 42 in:
Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For
Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative
immunoassays which determine concentrations of antigen-bearing substances in biological
samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen
in a biological sample. In addition, antibodies are useful in various animal models of



potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic 
or immunoprophylactic reagent.

4. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Borrelia burgdorferi genome, such as those of Tables 1-6 and
SEQ ID NOS: 1-155 can be used, in accordance with the present invention, to prepare PCR
primers for a variety of uses. The PCR primers are preferably at least 15 bases, and more
preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the
primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the
Examples that follow.
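
The primer-selection preferences above can be checked with a few lines of code. The sketch below uses the Wallace rule (2°C per A or T, 4°C per G or C), a common rough estimate for short oligonucleotides. The text does not prescribe a particular melting-temperature formula, and the `max_tm_diff` tolerance is a hypothetical value chosen for illustration.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C by the Wallace rule: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def check_primer_pair(forward, reverse, min_len=15, max_tm_diff=4):
    """Summarize whether a primer pair meets the length and matching preferences.

    min_len follows the "at least 15 bases" preference in the text;
    max_tm_diff is an assumed tolerance, not a value from the text.
    """
    tm_diff = abs(wallace_tm(forward) - wallace_tm(reverse))
    return {
        "long_enough": len(forward) >= min_len and len(reverse) >= min_len,
        "gc_forward": gc_fraction(forward),
        "gc_reverse": gc_fraction(reverse),
        "tm_diff": tm_diff,
        "tm_matched": tm_diff <= max_tm_diff,
    }


if __name__ == "__main__":
    print(check_primer_pair("AAAAAAAAAAGGGGG", "TTTTTTTTTTCCCCC"))
```

Primers with equal length and equal G/C content receive the same Wallace estimate, which is the reasoning behind matching the G/C ratio of a pair.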



5. Isolation of a Selected DNA Clone From B. burgdorferi 

Three approaches are used to isolate a B. burgdorferi clone comprising a polynucleotide
of the present invention from any B. burgdorferi genomic DNA library. The B. burgdorferi
strain B31PU has been deposited as a convenient source for obtaining a B. burgdorferi strain,
although a wide variety of B. burgdorferi strains known in the art can be used.

B. burgdorferi genomic DNA is prepared using the following method. A 20 ml overnight
bacterial culture is grown in a rich medium (e.g., Trypticase Soy Broth, Brain Heart Infusion broth
or Super broth), pelleted, washed two times with TES (30 mM Tris-pH 8.0, 25 mM EDTA, 50 mM
NaCl), and resuspended in 5 ml high salt TES (2.5 M NaCl). Lysostaphin is added to a final
concentration of approximately 50 µg/ml and the mixture is rotated slowly for 1 hour at 37°C to
make protoplast cells. The solution is then placed in an incubator (or in a shaking water bath) and
warmed to 55°C. Five hundred microliters of 20% sarcosyl in TES (final concentration 2%) is
then added to lyse the cells. Next, guanidine HCl is added to a final concentration of 7 M (3.69 g
in 5.5 ml). The mixture is swirled slowly at 55°C for 60-90 min (solution should clear). A CsCl
gradient is then set up in SW41 ultra clear tubes using 2.0 ml 5.7 M CsCl and overlaying with
2.85 M CsCl. The gradient is carefully overlaid with the DNA-containing GuHCl solution.
The gradient is spun at 30,000 rpm, 20°C for 24 hr and the lower DNA band is collected. The
volume is increased to 5 ml with TE buffer. The DNA is then treated with protease K (10 µg/ml)
overnight at 37°C, and precipitated with ethanol. The precipitated DNA is resuspended in a
desired buffer.
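
The stated final concentrations of sarcosyl and guanidine HCl can be checked with simple arithmetic. The following sketch assumes a molecular weight of 95.53 g/mol for guanidine hydrochloride and ignores the volume contributed by the dissolved solid. The function names are illustrative.

```python
GUANIDINE_HCL_MW = 95.53  # g/mol, molecular weight of guanidine hydrochloride


def molarity(mass_g, mw_g_per_mol, volume_ml):
    """Molar concentration of a solid dissolved in a given volume (volume change ignored)."""
    return (mass_g / mw_g_per_mol) / (volume_ml / 1000.0)


def diluted_percent(stock_percent, stock_volume_ml, start_volume_ml):
    """Final percent concentration after adding a stock solution to a starting volume."""
    total_ml = stock_volume_ml + start_volume_ml
    return stock_percent * stock_volume_ml / total_ml


if __name__ == "__main__":
    # 0.5 ml of 20% sarcosyl added to the 5 ml cell suspension.
    print(f"sarcosyl: {diluted_percent(20.0, 0.5, 5.0):.2f} %")
    # 3.69 g guanidine HCl in the resulting 5.5 ml.
    print(f"guanidine HCl: {molarity(3.69, GUANIDINE_HCL_MW, 5.5):.2f} M")
```

Both results agree with the rounded values given in the protocol (about 2% sarcosyl and about 7 M guanidine HCl).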

In the first method, a plasmid is directly isolated by screening a plasmid B. burgdorferi
genomic DNA library using a polynucleotide probe corresponding to a polynucleotide of the
present invention. Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized
using an Applied Biosystems DNA synthesizer according to the sequence reported. The
oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide kinase and
purified according to routine methods. (See, e.g., Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY (1982).) The library is
transformed into a suitable host, as indicated above (such as XL-1 Blue (Stratagene)) using
techniques known to those of skill in the art. See, e.g., Sambrook et al. MOLECULAR
CLONING: A LABORATORY MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel
et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y.
1989). The transformants are plated on 1.5% agar plates (containing the appropriate selection
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. These plates
are screened using Nylon membranes according to routine methods for bacterial colony
screening. See, e.g., Sambrook et al. MOLECULAR CLONING: A LABORATORY
MANUAL (Cold Spring Harbor, N.Y. 2nd ed. 1989); Ausubel et al., CURRENT PROTOCOLS
IN MOLECULAR BIOLOGY (John Wiley and Sons, N.Y. 1989) or other techniques known to
those of skill in the art.

Alternatively, two primers of 15-25 nucleotides derived from the 5' and 3' ends of a
polynucleotide of SEQ ID NOS: 1-155 are synthesized and used to amplify the desired DNA by
PCR using a B. burgdorferi genomic DNA prep as a template. PCR is carried out under routine
conditions, for instance, in 25 µl of reaction mixture with 0.5 µg of the above DNA template. A
convenient reaction mixture is 1.5-5 mM MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP,
dCTP, dGTP, dTTP, 25 pmol of each primer and 0.25 Unit of Taq polymerase. Thirty-five
cycles of PCR (denaturation at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C
for 1 min) are performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified
product is analyzed by agarose gel electrophoresis and the DNA band with expected molecular
weight is excised and purified. The PCR product is verified to be the selected sequence by
subcloning and sequencing the DNA product.
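
Deriving the two primers from the ends of a target sequence can be sketched as follows. The forward primer is read directly from the 5' end of the target, and the reverse primer is the reverse complement of the 3' end. The function names and the default length of 20 are illustrative assumptions. Only the 15-25 nucleotide range comes from the text.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(target, length=20):
    """Return (forward, reverse) primers taken from the 5' and 3' ends of target.

    Both primers are written 5' to 3'. The length must fall within the
    15-25 nucleotide range given in the text.
    """
    if not 15 <= length <= 25:
        raise ValueError("primer length must be between 15 and 25 nucleotides")
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


if __name__ == "__main__":
    demo_target = "A" * 15 + "C" * 10 + "G" * 15  # placeholder, not a real sequence
    print(end_primers(demo_target, length=15))
```

The placeholder target in the usage lines is artificial. In practice the target would be a region of one of the disclosed sequences.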

Finally, overlapping oligos of the DNA sequences of SEQ ID NOS: 1-155 can be 
chemically synthesized and used to generate a nucleotide sequence of desired length using PCR 
methods known in the art. 

6(a). Expression and Purification of Borrelia Polypeptides in E. coli

The bacterial expression vector pQE60 is used for bacterial expression of some of the
polypeptide fragments of the present invention. (QIAGEN, Inc., 9259 Eton Avenue,
Chatsworth, CA, 91311). pQE60 encodes ampicillin antibiotic resistance ("Ampr") and contains
a bacterial origin of replication ("ori"), an IPTG inducible promoter, a ribosome binding site
("RBS"), six codons encoding histidine residues that allow affinity purification using
nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin (QIAGEN, Inc., supra) and suitable single
restriction enzyme cleavage sites. These elements are arranged such that an inserted DNA
fragment encoding a polypeptide expresses that polypeptide with the six His residues (i.e., a
"6 X His tag") covalently linked to the carboxyl terminus of that polypeptide.

The DNA sequence encoding the desired portion of a B. burgdorferi protein of the
present invention is amplified from B. burgdorferi genomic DNA using PCR oligonucleotide
primers which anneal to the 5' and 3' sequences coding for the portions of the B. burgdorferi
polynucleotide shown in SEQ ID NOS: 1-155. Additional nucleotides containing restriction sites
to facilitate cloning in the pQE60 vector are added to the 5' and 3' sequences, respectively.

For cloning the mature protein, the 5' primer has a sequence containing an appropriate
restriction site followed by nucleotides of the amino terminal coding sequence of the desired B.
burgdorferi polynucleotide sequence in SEQ ID NOS: 1-155. One of ordinary skill in the art
would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin
may be varied to amplify a DNA segment encoding any desired portion of the complete protein
shorter or longer than the mature form. The 3' primer has a sequence containing an appropriate
restriction site followed by nucleotides complementary to the 3' end of the polypeptide coding
sequence of SEQ ID NOS: 1-155, excluding a stop codon, with the coding sequence aligned with
the restriction site so as to maintain its reading frame with that of the six His codons in the pQE60
vector.
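
The two requirements placed on the coding sequence for a carboxyl-terminal tag (no stop codon, reading frame preserved) can be expressed as a small check. This sketch is illustrative only. The function name is hypothetical, and the choice of restriction sites and any linker nucleotides needed to join the insert in frame to the vector are not modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_insert_for_c_terminal_tag(orf):
    """Return the ORF prepared for an in-frame C-terminal fusion.

    The terminal stop codon, if present, is removed so that translation can
    continue into downstream tag codons. The sequence must be a whole number
    of codons and must not contain an internal in-frame stop codon.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal in-frame stop codon found")
    return orf


if __name__ == "__main__":
    print(coding_insert_for_c_terminal_tag("ATGAAATAA"))
```

Because the returned sequence is always a whole number of codons, it can be joined to the tag codons without shifting the reading frame, provided the flanking restriction-site nucleotides are also added in multiples of three.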

The amplified B. burgdorferi DNA fragment and the vector pQE60 are digested with
restriction enzymes which recognize the sites in the primers and the digested DNAs are then
ligated together. The B. burgdorferi DNA is inserted into the restricted pQE60 vector in a manner
which places the B. burgdorferi protein coding region downstream from the IPTG-inducible
promoter and in-frame with an initiating AUG and the six histidine codons.

The ligation mixture is transformed into competent E. coli cells using standard procedures
such as those described by Sambrook et al., supra. E. coli strain M15/rep4, containing multiple
copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin
resistance ("Kanr"), is used in carrying out the illustrative example described herein. This strain,
which is only one of many that are suitable for expressing a B. burgdorferi polypeptide, is
available commercially (QIAGEN, Inc., supra). Transformants are identified by their ability to
grow on LB agar plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated
from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis,
PCR and DNA sequencing.

Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in
LB media supplemented with both ampicillin (100 µg/ml) and kanamycin (25 µg/ml). The O/N
culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250. The cells
are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6.
Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce
transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells
subsequently are incubated further for 3 to 4 hours. Cells then are harvested by centrifugation.

The cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8. The cell
debris is removed by centrifugation, and the supernatant containing the B. burgdorferi
polypeptide is loaded onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column
(QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high affinity
and can be purified in a simple one-step procedure (for details see: The QIAexpressionist, 1995,
QIAGEN, Inc., supra). Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl,
pH 8, the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed
with 10 volumes of 6 M guanidine-HCl pH 6, and finally the B. burgdorferi polypeptide is
eluted with 6 M guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-buffered saline
(PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the protein could be
successfully refolded while immobilized on the Ni-NTA column. The recommended conditions
are as follows: renature using a linear 6M-1M urea gradient in 500 mM NaCl, 20% glycerol, 20
mM Tris/HCl pH 7.4, containing protease inhibitors. The renaturation should be performed over
a period of 1.5 hours or more. After renaturation the proteins can be eluted by the addition of
250 mM imidazole. Imidazole is removed by a final dialyzing step against PBS or 50 mM
sodium acetate pH 6 buffer plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at
-80°C.

The polypeptides of the present invention are also prepared using a non-denaturing protein
purification method. For these polypeptides, the cell pellet from each liter of culture is
resuspended in 25 ml of Lysis Buffer A at 4°C (Lysis Buffer A = 50 mM Na-phosphate, 300
mM NaCl, 10 mM 2-mercaptoethanol, 10% Glycerol, pH 7.5 with 1 tablet of Complete
EDTA-free protease inhibitor cocktail (Boehringer Mannheim #1873580) per 50 ml of buffer).
Absorbance at 550 nm is approximately 10-20 O.D./ml. The suspension is then put through
three freeze/thaw cycles from -70°C (using an ethanol-dry ice bath) up to room temperature. The
cells are lysed via sonication in short 10 sec bursts over 3 minutes at approximately 80W while
kept on ice. The sonicated sample is then centrifuged at 15,000 RPM for 30 minutes at 4°C. The
supernatant is passed through a column containing 1.0 ml of CL-4B resin to pre-clear the sample
of any proteins that may bind to agarose non-specifically, and the flow-through fraction is
collected.

The pre-cleared flow-through is applied to a nickel-nitrilo-tri-acetic acid ("Ni-NTA")
affinity resin column (QIAGEN, Inc., supra). Proteins with a 6 X His tag bind to the Ni-NTA
resin with high affinity and can be purified in a simple one-step procedure. Briefly, the
supernatant is loaded onto the column in Lysis Buffer A at 4°C, the column is first washed with
10 volumes of Lysis Buffer A until the A280 of the eluate returns to the baseline. Then, the
column is washed with 5 volumes of 40 mM Imidazole (92% Lysis Buffer A / 8% Buffer B)
(Buffer B = 50 mM Na-Phosphate, 300 mM NaCl, 10% Glycerol, 10 mM 2-mercaptoethanol,
500 mM Imidazole, pH of the final buffer should be 7.5). The protein is eluted off of the column
with a series of increasing Imidazole solutions made by adjusting the ratios of Lysis Buffer A to
Buffer B. Three different concentrations are used: 3 volumes of 75 mM Imidazole, 3 volumes of
150 mM Imidazole, 5 volumes of 500 mM Imidazole. The fractions containing the purified
protein are analyzed using 8%, 10% or 14% SDS-PAGE depending on the protein size. The
purified protein is then dialyzed 2X against phosphate-buffered saline (PBS) in order to place it
into an easily workable buffer. The purified protein is stored at 4°C or frozen at -80°C.
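
The imidazole wash and elution steps in the non-denaturing method are made by mixing Lysis Buffer A with Buffer B. The proportions follow from the 500 mM imidazole content of Buffer B, as the sketch below shows. It assumes that Lysis Buffer A contains no imidazole, consistent with the recipe given for it. The function names are illustrative.

```python
BUFFER_B_IMIDAZOLE_MM = 500.0  # imidazole concentration of Buffer B, from the text


def buffer_b_fraction(target_mm, stock_mm=BUFFER_B_IMIDAZOLE_MM):
    """Fraction of Buffer B needed to reach target_mm imidazole when mixed with Buffer A."""
    if not 0 <= target_mm <= stock_mm:
        raise ValueError("target must lie between 0 and the Buffer B concentration")
    return target_mm / stock_mm


def mixing_table(targets_mm=(40, 75, 150, 500)):
    """Return {target mM: (percent Lysis Buffer A, percent Buffer B)}."""
    table = {}
    for target in targets_mm:
        fraction_b = buffer_b_fraction(target)
        table[target] = (round(100 * (1 - fraction_b), 1), round(100 * fraction_b, 1))
    return table


if __name__ == "__main__":
    for target, (pct_a, pct_b) in mixing_table().items():
        print(f"{target} mM imidazole: {pct_a}% Lysis Buffer A / {pct_b}% Buffer B")
```

The 40 mM entry reproduces the 92% / 8% ratio stated for the wash step. The 75, 150 and 500 mM steps correspond to 15%, 30% and 100% Buffer B.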

The following alternative method may be used to purify a B. burgdorferi polypeptide expressed
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified, all of the
following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell culture is
cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm
(Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste
and the amount of purified protein required, an appropriate amount of cell paste, by weight, is
suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are
dispersed to a homogeneous suspension using a high shear mixer.

The cells are then lysed by passing the solution through a microfluidizer (Microfluidics
Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is then mixed with NaCl
solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15
min. The resultant pellet is washed again using 0.5 M NaCl, 100 mM Tris, 50 mM EDTA, pH
7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is
discarded and the B. burgdorferi polypeptide-containing supernatant is incubated at 4°C
overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles, the
GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of
buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring.
The refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further
purification steps.

To clarify the refolded B. burgdorferi polypeptide solution, a previously prepared
tangential filtration unit equipped with 0.16 µm membrane filter with appropriate surface area
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The filtered
sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive Biosystems). The
column is washed with 40 mM sodium acetate, pH 6.0 and eluted with 250 mM, 500 mM, 1000
mM, and 1500 mM NaCl in the same buffer, in a stepwise manner. The absorbance at 280 nm
of the effluent is continuously monitored. Fractions are collected and further analyzed by
SDS-PAGE.

Fractions containing the B. burgdorferi polypeptide are then pooled and mixed with 4 volumes of water. The diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion (Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant monitoring of the effluent. Fractions containing the B. burgdorferi polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.
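For illustration, the NaCl concentration at any point of the 10 column volume linear gradient described above may be computed as sketched below. The sketch assumes an ideal linear gradient with no mixing delay and models only the NaCl component, not the accompanying change from pH 6.0 to pH 6.5; the function name is hypothetical.

```python
def nacl_molar_at(column_volumes: float,
                  start_molar: float = 0.2,
                  end_molar: float = 1.0,
                  gradient_length_cv: float = 10.0) -> float:
    """NaCl concentration (M) after a given number of column volumes
    of an ideal linear gradient."""
    if not 0.0 <= column_volumes <= gradient_length_cv:
        raise ValueError("position lies outside the gradient")
    fraction = column_volumes / gradient_length_cv
    return start_molar + (end_molar - start_molar) * fraction


# Concentration at the start, midpoint and end of the gradient.
for cv in (0.0, 5.0, 10.0):
    print(f"{cv:4.1f} CV: {nacl_molar_at(cv):.2f} M NaCl")
```

Such a calculation can help relate the fraction in which the polypeptide elutes to an approximate salt concentration.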

The resultant B. burgdorferi polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed on a Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded. The purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.

6(b). Alternative Expression and Purification of Borrelia polypeptides in E. coli

The vector pQE10 is alternatively used to clone and express some of the polypeptides of the present invention for use in the soft tissue and systemic infection models discussed below. The difference is that an inserted DNA fragment encoding a polypeptide expresses that polypeptide with six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus of that polypeptide. The bacterial expression vector pQE10 (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311) was used in this example. The components of the pQE10 plasmid are arranged such that the inserted DNA sequence encoding a polypeptide of the present invention expresses the polypeptide with the six His residues (i.e., a "6 X His tag") covalently linked to the amino terminus.

The DNA sequences encoding the desired portions of a polypeptide of SEQ ID NOS: 1-155 were amplified using PCR oligonucleotide primers from genomic B. burgdorferi DNA. The PCR primers anneal to the nucleotide sequences encoding the desired amino acid sequence of a polypeptide of the present invention. Additional nucleotides containing restriction sites to facilitate cloning in the pQE10 vector were added to the 5' and 3' primer sequences, respectively.

For cloning a polypeptide of the present invention, the 5' and 3' primers were selected to amplify their respective nucleotide coding sequences. One of ordinary skill in the art would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention. The 5' primer was designed so the coding sequence of the 6 X His tag is aligned with the restriction site so as to maintain its reading frame with that of the B. burgdorferi polypeptide. The 3' primer was designed to include a stop codon. The amplified DNA fragment was then cloned, and the protein expressed, as described above for the pQE60 plasmid.

The DNA sequences of SEQ ID NOS: 1-155 encoding amino acid sequences may also be cloned and expressed as fusion proteins by a protocol similar to that described directly above, wherein the pET-32b(+) vector (Novagen, 601 Science Drive, Madison, WI 53711) is preferentially used in place of pQE10.

The above methods are not limited to the polypeptide fragments actually produced. The above method, like the methods below, can be used to produce either full length polypeptides or desired fragments thereof.

6(c). Alternative Expression and Purification of Borrelia polypeptides in E. coli

The bacterial expression vector pQE60 is used for bacterial expression in this example (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, CA, 91311). However, in this example, the polypeptide coding sequence is inserted such that translation of the six His codons is prevented and, therefore, the polypeptide is produced with no 6 X His tag.

The DNA sequence encoding the desired portion of the B. burgdorferi amino acid sequence is amplified from a B. burgdorferi genomic DNA prep or the deposited DNA clones using PCR oligonucleotide primers which anneal to the 5' and 3' nucleotide sequences corresponding to the desired portion of the B. burgdorferi polypeptides. Additional nucleotides containing restriction sites to facilitate cloning in the pQE60 vector are added to the 5' and 3' primer sequences.

For cloning a B. burgdorferi polypeptide of the present invention, 5' and 3' primers are selected to amplify their respective nucleotide coding sequences. One of ordinary skill in the art would appreciate that the point in the protein coding sequence where the 5' and 3' primers begin may be varied to amplify a DNA segment encoding any desired portion of a polypeptide of the present invention. The 5' and 3' primers contain appropriate restriction sites followed by nucleotides complementary to the 5' and 3' ends of the coding sequence, respectively. The 3' primer is additionally designed to include an in-frame stop codon.
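The primer layout described above (a restriction site followed by nucleotides matching an end of the coding sequence, with an in-frame stop codon carried by the 3' primer) may be sketched as follows. This is an illustration under stated assumptions only: the coding sequence is an invented toy sequence rather than any sequence of SEQ ID NOS: 1-155, the BamHI (GGATCC) and HindIII (AAGCTT) sites are arbitrary placeholders rather than the sites used with pQE60, and the function names and the 18-nucleotide annealing length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(coding_seq: str, site_5prime: str, site_3prime: str,
                   anneal_len: int = 18, stop_codon: str = "TAA"):
    """Build a 5' and a 3' primer for a coding sequence.

    The 5' primer is the restriction site followed by the first
    anneal_len nucleotides of the coding strand. The 3' primer is the
    restriction site followed by the reverse complement of the last
    anneal_len coding nucleotides plus an in-frame stop codon.
    """
    coding_seq = coding_seq.upper()
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence is not a whole number of codons")
    if len(coding_seq) < anneal_len:
        raise ValueError("coding sequence is shorter than the annealing region")
    forward = site_5prime + coding_seq[:anneal_len]
    reverse = site_3prime + reverse_complement(coding_seq[-anneal_len:] + stop_codon)
    return forward, reverse


# Invented 12-codon toy sequence, used only to exercise the functions.
TOY_ORF = "ATGGCTAGCAAAGGTGAACTGCTGACCGGTGTTGTT"
forward_primer, reverse_primer = design_primers(TOY_ORF, "GGATCC", "AAGCTT")
print("5' primer:", forward_primer)
print("3' primer:", reverse_primer)
```

The sketch does not check the reading frame relative to the initiating AUG of the vector or add the extra flanking bases that restriction enzymes often need for efficient cutting; both would have to be considered for an actual construct.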

The amplified B. burgdorferi DNA fragments and the vector pQE60 are digested with restriction enzymes recognizing the sites in the primers and the digested DNAs are then ligated together. Insertion of the B. burgdorferi DNA into the restricted pQE60 vector places the B. burgdorferi protein coding region including its associated stop codon downstream from the IPTG-inducible promoter and in-frame with an initiating AUG. The associated stop codon prevents translation of the six histidine codons downstream of the insertion point.

The ligation mixture is transformed into competent E. coli cells using standard procedures such as those described by Sambrook et al. E. coli strain M15/rep4, containing multiple copies of the plasmid pREP4, which expresses the lac repressor and confers kanamycin resistance ("Kanr"), is used in carrying out the illustrative example described herein. This strain, which is only one of many that are suitable for expressing B. burgdorferi polypeptide, is available commercially (QIAGEN, Inc., supra). Transformants are identified by their ability to grow on LB plates in the presence of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA sequencing.

Clones containing the desired constructs are grown overnight ("O/N") in liquid culture in LB media supplemented with both ampicillin (100 µg/ml) and kanamycin (25 µg/ml). The O/N culture is used to inoculate a large culture, at a dilution of approximately 1:25 to 1:250. The cells are grown to an optical density at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside ("IPTG") is then added to a final concentration of 1 mM to induce transcription from the lac repressor sensitive promoter, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours. Cells then are harvested by centrifugation.
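As a worked example of the inoculation and induction volumes described above, the following sketch computes the volume of O/N culture needed for a given dilution and the volume of IPTG stock needed for a 1 mM final concentration. It rests on assumptions not stated in the text: a 1 liter culture, a 1 M IPTG stock, a dilution of 1:N read as one part O/N culture in N parts final volume, and a negligible volume increase upon IPTG addition. The function names are hypothetical.

```python
def inoculum_volume_ml(final_culture_ml: float, dilution_factor: float) -> float:
    """Volume of overnight culture for a 1:dilution_factor inoculation,
    reading 1:N as one part inoculum in N parts final volume."""
    return final_culture_ml / dilution_factor


def iptg_stock_volume_ml(culture_ml: float, final_mm: float = 1.0,
                         stock_mm: float = 1000.0) -> float:
    """Volume of IPTG stock giving the final concentration (C1*V1 = C2*V2),
    ignoring the small volume added by the stock itself."""
    return culture_ml * final_mm / stock_mm


culture_ml = 1000.0  # assumed 1 liter expression culture
print("Inoculum at 1:25 :", inoculum_volume_ml(culture_ml, 25), "ml")
print("Inoculum at 1:250:", inoculum_volume_ml(culture_ml, 250), "ml")
print("1 M IPTG stock   :", iptg_stock_volume_ml(culture_ml), "ml")
```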

To purify the B. burgdorferi polypeptide, the cells are then stirred for 3-4 hours at 4°C in 6M guanidine-HCl, pH 8. The cell debris is removed by centrifugation, and the supernatant containing the B. burgdorferi polypeptide is dialyzed against 50 mM Na-acetate buffer pH 6, supplemented with 200 mM NaCl. Alternatively, the protein can be successfully refolded by dialyzing it against 500 mM NaCl, 20% glycerol, 25 mM Tris/HCl pH 7.4, containing protease inhibitors. After renaturation the protein can be purified by ion exchange, hydrophobic interaction and size exclusion chromatography. Alternatively, an affinity chromatography step such as an antibody column can be used to obtain pure B. burgdorferi polypeptide. The purified protein is stored at 4°C or frozen at -80°C.

The following alternative method may be used to purify B. burgdorferi polypeptides expressed in E. coli when they are present in the form of inclusion bodies. Unless otherwise specified, all of the following steps are conducted at 4-10°C.

Upon completion of the production phase of the E. coli fermentation, the cell culture is cooled to 4-10°C and the cells are harvested by continuous centrifugation at 15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit weight of cell paste and the amount of purified protein required, an appropriate amount of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a high shear mixer.

The cells are then lysed by passing the solution through a microfluidizer (Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5M NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4.

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the pellet is discarded and the B. burgdorferi polypeptide-containing supernatant is incubated at 4°C overnight to allow further GuHCl extraction.

Following high speed centrifugation (30,000 x g) to remove insoluble particles, the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing for 12 hours prior to further purification steps.

To clarify the refolded B. burgdorferi polypeptide solution, a previously prepared tangential filtration unit equipped with 0.16 µm membrane filter with appropriate surface area (e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored. Fractions are collected and further analyzed by SDS-PAGE.

Fractions containing the B. burgdorferi polypeptide are then pooled and mixed with 4 volumes of water. The diluted sample is then loaded onto a previously prepared set of tandem columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak anion (Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant monitoring of the effluent. Fractions containing the B. burgdorferi polypeptide (determined, for instance, by 16% SDS-PAGE) are then pooled.

The resultant B. burgdorferi polypeptide exhibits greater than 95% purity after the above refolding and purification steps. No major contaminant bands are observed on a Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded. The purified protein is also tested for endotoxin/LPS contamination, and typically the LPS content is less than 0.1 ng/ml according to LAL assays.

6(d). Cloning and Expression of B. burgdorferi in Other Bacteria 

B. burgdorferi polypeptides can also be produced in: B. burgdorferi using the methods of S. Skinner et al., (1988) Mol. Microbiol. 2:289-297 or J. L Moreno (1996) Protein Expr. Purif. 8(3):332-340; Lactobacillus using the methods of C. Rush et al., 1997 Appl. Microbiol. Biotechnol. 47(5):537-542; or in Bacillus subtilis using the methods of Chang et al., U.S. Patent No. 4,952,508.






7. Cloning and Expression in COS Cells 

A B. burgdorferi expression plasmid is made by cloning a portion of the DNA encoding a B. burgdorferi polypeptide into the expression vector pDNAI/Amp or pDNAIII (which can be obtained from Invitrogen, Inc.). The expression vector pDNAI/Amp contains: (1) an E. coli origin of replication effective for propagation in E. coli and other prokaryotic cells; (2) an ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV promoter, a polylinker, an SV40 intron; (5) several codons encoding a hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by a termination codon and polyadenylation signal arranged so that a DNA can be conveniently placed under expression control of the CMV promoter and operably linked to the SV40 intron and the polyadenylation signal by means of restriction sites in the polylinker. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein described by Wilson et al. 1984 Cell 37:767. The fusion of the HA tag to the target protein allows easy detection and recovery of the recombinant protein with an antibody that recognizes the HA epitope. pDNAIII contains, in addition, the selectable neomycin marker.

A DNA fragment encoding a B. burgdorferi polypeptide is cloned into the polylinker region of the vector so that recombinant protein expression is directed by the CMV promoter. The plasmid construction strategy is as follows. The DNA from a B. burgdorferi genomic DNA prep is amplified using primers that contain convenient restriction sites, much as described above for construction of vectors for expression of B. burgdorferi in E. coli. The 5' primer contains a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the B. burgdorferi polypeptide. The 3' primer contains nucleotides complementary to the 3' coding sequence of the B. burgdorferi DNA, a stop codon, and a convenient restriction site.

The PCR amplified DNA fragment and the vector, pDNAI/Amp, are digested with appropriate restriction enzymes and then ligated. The ligation mixture is transformed into an appropriate E. coli strain such as SURE™ (Stratagene Cloning Systems, La Jolla, CA 92037), and the transformed culture is plated on ampicillin media plates which then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from resistant colonies and examined by restriction analysis or other means for the presence of the fragment encoding the B. burgdorferi polypeptide.

For expression of a recombinant B. burgdorferi polypeptide, COS cells are transfected with an expression vector, as described above, using DEAE-dextran, as described, for instance, by Sambrook et al. (supra). Cells are incubated under conditions for expression of B. burgdorferi by the vector.

Expression of the B. burgdorferi-HA fusion protein is detected by radiolabeling and immunoprecipitation, using methods described in, for example, Harlow et al., supra. To this end, two days after transfection, the cells are labeled by incubation in media containing 35S-cysteine for 8 hours. The cells and the media are collected, and the cells are washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al. (supra). Proteins are precipitated from the cell lysate and from the culture media using an HA-specific monoclonal antibody. The precipitated proteins then are analyzed by SDS-PAGE and autoradiography. An expression product of the expected size is seen in the cell lysate, which is not seen in negative controls.


8. Cloning and Expression in CHO Cells 

The vector pC4 is used for the expression of B. burgdorferi polypeptide in this example. Plasmid pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146). The plasmid contains the mouse DHFR gene under control of the SV40 early promoter. Chinese hamster ovary cells or other cells lacking dihydrofolate reductase (DHFR) activity that are transfected with these plasmids can be selected by growing the cells in a selective medium (alpha minus MEM, Life Technologies) supplemented with the chemotherapeutic agent methotrexate. The amplification of the DHFR genes in cells resistant to methotrexate (MTX) has been well documented. See, e.g., Alt et al., 1978, J. Biol. Chem. 253:1357-1370; Hamlin et al., 1990, Biochem. et Biophys. Acta, 1097:107-143; Page et al., 1991, Biotechnology 9:64-68. Cells grown in increasing concentrations of MTX develop resistance to the drug by overproducing the target enzyme, DHFR, as a result of amplification of the DHFR gene. If a second gene is linked to the DHFR gene, it is usually co-amplified and over-expressed. It is known in the art that this approach may be used to develop cell lines carrying more than 1,000 copies of the amplified gene(s). Subsequently, when the methotrexate is withdrawn, cell lines are obtained which contain the amplified gene integrated into one or more chromosome(s) of the host cell.

Plasmid pC4 contains the strong promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus, for expressing a polypeptide of interest, Cullen, et al. (1985) Mol. Cell. Biol. 5:438-447; plus a fragment isolated from the enhancer of the immediate early gene of human cytomegalovirus (CMV), Boshart, et al., 1985, Cell 41:521-530. Downstream of the promoter are the following single restriction enzyme cleavage sites that allow the integration of the genes: Bam HI, Xba I, and Asp 718. Behind these cloning sites the plasmid contains the 3' intron and polyadenylation site of the rat preproinsulin gene. Other high efficiency promoters can also be used for the expression, e.g., the human β-actin promoter, the SV40 early or late promoters or the long terminal repeats from other retroviruses, e.g., HIV and HTLVI. Clontech's Tet-Off and Tet-On gene expression systems and similar systems can be used to express the B. burgdorferi polypeptide in a regulated way in mammalian cells (Gossen et al., 1992, Proc. Natl. Acad. Sci. USA 89:5547-5551). For the polyadenylation of the mRNA other signals, e.g., from the human growth hormone or globin genes can be used as well. Stable cell lines carrying a gene of interest integrated into the chromosomes can also be selected upon co-transfection with a selectable marker such as gpt, G418 or hygromycin. It is advantageous to use more than one selectable marker in the beginning, e.g., G418 plus methotrexate.

The plasmid pC4 is digested with the restriction enzymes and then dephosphorylated using calf intestinal phosphatase by procedures known in the art. The vector is then isolated from a 1% agarose gel. The DNA sequence encoding the B. burgdorferi polypeptide is amplified using PCR oligonucleotide primers corresponding to the 5' and 3' sequences of the desired portion of the gene. A 5' primer containing a restriction site, a Kozak sequence, an AUG start codon, and nucleotides of the 5' coding region of the B. burgdorferi polypeptide is synthesized and used. A 3' primer, containing a restriction site, stop codon, and nucleotides complementary to the 3' coding sequence of the B. burgdorferi polypeptides is synthesized and used. The amplified fragment is digested with the restriction endonucleases and then purified again on a 1% agarose gel. The isolated fragment and the dephosphorylated vector are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then transformed and bacteria are identified that contain the fragment inserted into plasmid pC4 using, for instance, restriction enzyme analysis.

Chinese hamster ovary cells lacking an active DHFR gene are used for transfection. Five µg of the expression plasmid pC4 is cotransfected with 6.5 µg of the plasmid pSVneo using a lipid-mediated transfection agent such as Lipofectin™ or LipofectAMINE™ (Life Technologies, Gaithersburg, MD). The plasmid pSV2-neo contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that confers resistance to a group of antibiotics including G418. The cells are seeded in alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418. After about 10-14 days single clones are trypsinized and then seeded in 6-well petri dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of methotrexate are then transferred to new 6-well plates containing even higher concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 mM, 20 mM). The same procedure is repeated until clones are obtained which grow at a concentration of 100-200 nM. Expression of the desired gene product is analyzed, for instance, by SDS-PAGE and Western blot or by reversed phase HPLC analysis.

The disclosures of all publications (including patents, patent applications, journal articles, laboratory manuals, books, or other documents) cited herein are hereby incorporated by reference in their entireties.

The present invention is not to be limited in scope by the specific embodiments described herein, which are intended as single illustrations of individual aspects of the invention. Functionally equivalent methods and components are within the scope of the invention, in addition to those shown and described herein and will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.

TABLE 3.

Borrelia burgdorferi - Putative coding regions of novel proteins not similar to known proteins

TABLE 6.

Borrelia burgdorferi - Putative coding regions of novel proteins not similar to known proteins



Contig ID   ORF ID   Start (nt)   Stop (nt)
2           4        2730         3554
2           5        3559         3410
2           7        5464         3869
2           13       10502        9999
2           17       13800        13576
2           19       15368        15204
2           28       21155        21400
2           50       41944        42186
2           58       53786        52911
2           59       54816        53773
2           61       57393        55813
2           63       57882        57682
2           65       60898        60203
2           66       61441        62070
2           67       62078        62692
2           70       65896        66540
2           74       70203        69910
2           78       71818        71399
2           80       72956        74032
2           81       73515        73267
2           90       92181        92525
2           91       92968        92555
2           108      109872       110057
2           112      112408       112812
2           113      112858       113037
2           114      113035       113460
2           115      113506       113724
2           119      114325       114852
3           6        3279         4079
3           8        5156         6019
3           54       42256        42789
3           59       47264        47506
3           60       47673        48692
3           63       51475        51026
3           70       60330        60575
3           71       61050        61349
3           72       61347        61670
3           74       63917        64303
3           86       75347        75532
3           88       76593        77384
3           99       89769        89005
3           102      91278        91661
3           103      92137        92463
3           105      92423        92785
3           108      93467        93886
3           115      98262        98681
3           121      102227       102904
3           126      111308       110055
4           6        3751         4179
4           7        4218         5042
4           19       16115        15516
4           20       17028        16075
4           21       17379        17092
4           22       17735        17397
4           24       19243        18785
4           25       18942        19196
4           26       20677        19259
4           27       19431        19751
4           29       21376        20876
4           30       21899        21423
4           31       22918        21845
4           33       23951        23553
4           37       26253        25627
4           38       26991        26332
4           39       28181        26931
4           40       29175        28522
4           43       30605        30342
4           45       34906        33548
4           48       35750        35932
5           3        2102         1527
5           5        2656         2393
5           7        3460         2900
5           10       6544         5645
5           40       25278        24322
5           41       25235        25600
5           42       25665        25276
5           44       25881        25663
5           47       27883        27410
5           48       28351        27881
5           49       29028        28324
5           50       29454        29026
5           56       32199        31666
5           57       32571        32200
5           58       32826        32569
5           60       32913        33245
5           61       33766        33575
5           62       34173        33742
5           64       35514        34861
6           2        954          1181
6           3        1590         1763
6           5        3400         3954
6           7        4691         5218
6           8        5187         5699
6           11       6498         5983
6           12       6975         6727
6           14       7978         7448
6           15       8479         7976
6           22       15106        15636
6           27       19999        18842
6           28       20036        20668
6           29       21814        20690
6           30       20949        21269
6           35       24136        23630
6           37       25697        26248
7           8        8100         7792
7           10       8145         8288
7           11       9374         8517
7           12       9771         9325
7           13       9652         10185
7           14       10163        9765
7           15       10517        10173
7           16       11363        10524
7           17       11904        11392
7           18       12495        11902
7           19       13516        12473
7           20       12807        13154
7           22       15149        14697
7           24       15855        15046
7           25       15503        15826
7           26       16638        15853
7           27       19344        16636
7           31       19473        19727
7           32       20067        19675
7           33       20762        20049
7           34       21136        20738
7           36       22975        23406
7           40       26667        25870
8           3        2907         4118
8           5        5898
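
The Start and Stop columns of the table above can be read programmatically. The following is a minimal sketch, assuming (this is not stated in the table itself) that coordinates are 1-based and inclusive and that a Start greater than the Stop indicates a coding region on the complementary strand. The class and function names are hypothetical; the two rows are taken from the table.

```python
from typing import NamedTuple, Tuple


class OrfRow(NamedTuple):
    contig_id: int
    orf_id: int
    start_nt: int
    stop_nt: int


def strand_and_length(row: OrfRow) -> Tuple[str, int]:
    """Return the inferred strand ('+' or '-') and the span in nucleotides.

    Assumes inclusive coordinates and that start > stop marks the
    complementary strand.
    """
    strand = "+" if row.start_nt < row.stop_nt else "-"
    length_nt = abs(row.stop_nt - row.start_nt) + 1
    return strand, length_nt


rows = [OrfRow(2, 4, 2730, 3554), OrfRow(2, 5, 3559, 3410)]
for row in rows:
    strand, length_nt = strand_and_length(row)
    print(f"contig {row.contig_id} ORF {row.orf_id}: "
          f"strand {strand}, {length_nt} nt, {length_nt // 3} codons")
```

Under these assumptions both example rows span a whole number of codons (825 and 150 nucleotides), which is consistent with the inclusive-coordinate reading.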
(i) APPLICANT: Human Genome Sciences, Inc. et al.
(ii) TITLE OF INVENTION: Borrelia burgdorferi Polynucleotides and Sequences
(iii) NUMBER OF SEQUENCES: 155
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: USA
(F) ZIP: 20850
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: Herewith
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(viii) ATTORNEY / AGENT INFORMATION:
(A) NAME: Brookes, A. Anders
(B) REGISTRATION NUMBER: 36,373
(C) REFERENCE / DOCKET NUMBER: PB370PCT
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (301) 309-8504
(B) TELEFAX: (301) 309-8512

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 910,715 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATATAATTTT TAATTAGTAT AGAATATGTT AAACTTTACC CTTGAATTTT TCTACTCTAT 
TTGTATATTC TATAGAAAAA ACGATTAGAA TTAAACAAAG CCATAACTGA ACCAACGGTA 
ATTAGTAGAT AAAGGGATCA AAATATTTTT TATTGCAGCA AGAATACCTT GGTATATTAG 
AAAAACCAAA AGTCATAGTC AAATCATCTT TTGATAACAA TCCCCAAATC TATAATTTAT 
TATGAAATTA ATTGCTCCCT TGAAAAGATT AGTTTTTAAA ACTACAAGAC TACTATCAAT 
CACTATCAGA TAGATTAAAA CAACCTTTAC AAGAAAAAAA TCTTACTACT ATTTTATTGT 
AAATGTATTA TAAAATAAGT TCATGCAAAA 'ACTTACAATT TTTCACAACA AACTACAATA 
AAATCATGTA AACAAACAAT TTCTTTGAAA ATTAAGCAAA TTTATAAATA TAAATTATAA 
AGATATATAT TTTTATATGA TCAATAATAA AAATTAATAG GATACTTATT TGGAAAAATT 
ATTGAAAAAA CAATAAGCAT GAATTGCCAC AATAAGCTAA TTGTCACTTA ATAATTCTTG 
TTTACTAGAC CACATTAGTA TAAACTCAAA TATTGGCTAC TATAATATAG GGGCTTTATA 
CGCCACATGT TTAATGATAA CATAAGAAAA TATTGCAATA ATAAAAAGAT TGAAATATCT 
TTATTAGAAA AGAATCTCGA. TAATTTAGAA AACAGAATAA AAATCATAAC TAATAAATAT 
AACGTTGAAA AAAATATATT CAAACTTTAA CTATACAATT AATTACACCT TAAAAATGCG 
TTACATAAAA ATTAAGGACT ACTATAAATA GAAAACACCA CATAACCTAC AGACTCTAAA 
GGAATAATTA AATCCTCATA TTTCAGTTCT CCAAAAGTTT AAATAGGGGC CTTTTACTTT 



60 
120 
180 
240. 
300 
360 
420 
480 
540 
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720 
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TCTTGATTAG CATATACATT ATTAAAGGCA TCTTCTTGGG CACTATCCTA AACTTTTTTA 1020 

CATTATTATT ATTTTATTCT TTATTATTAC AAGATAATTC AAGAATCTAG ATTACAAGAT ,.10.80 

ATCAATCCTG CCATTAGTAG TTCAATAAAA CATTTAGAAT ATTTATACAT TATTTAATGT 1140 

ATTTTTTTCA TTTTTGAAAT AATATTGTTA TAACTTAACT TAATAAGATA TTTGATTTCT 1200 

TCAACTTGAG AATCCGATGT ACATAGAATC TGAACATCTC CTCTGCCCCA TTTGCCAATA 1260 

TTCTTAATAT ATCTAGTAAA ACCCTCTTTT AAAATTATTT GATCTAGAGC AACAGTAATA 132 0 

GTAATATTAA TTTTATTTAC N CCCAGGTCTA AAGCTAAAAT CTACAAAATA TCCGCCCTGT 13 80 

ACTTTAAATC CTGTATAGCA CTGTGTTTCA ACTTTCTCAA TTTCATTAAA ATTTAAAACA 1440 

AAAATAAAAT CTTC TAATTC TTTATATATT GCTTTCATAT CGGAATTTAA TTTTTCAAAT 1500 

TTTTTTAAAT TTTCGGTTTT AATATTATTA TGTTTTATAC CAGAATCTGT GTCATCTTCT 15 60 

ATGTCACTTT TCTTGCTGTT TACTAATACA TCGCTTTTTT TTTCATCAAA AAACATACTA 1620 

AAAATATTTT TAATAATATC ATTAAATATT TTATCTGAAT ATGTTTTTTT AAAACCAATT 1680 

TTAGCTTTAA AAAAATCAAG CAAATCAACA CTTGGATTTT TTGTTTCCTT TTTTAAATAA 1740 

GCTGAAAATT TGTC TGTATA TTTTTTTTCT AATGCAAAAG ATCTAGCCTC TTCAACATTC 1800 

AAAGAATTTC TAGAAAACTT TTTAAGATAT TCAAAATCCT TAGATGTTAA TTTTTCTAAA 1860 

TTAACAACCA TAAAAGGCTC ATTGTCTAAC AAATTATCTT TATCTAGGTC AGTATAGAAT 1920 

CTATATTCTA TGGCATCTGT TAATATACCA AATTCAACTC TCTTTGCTTG AGAACGAATA 1980 

v. • ■ - 

TTTTCAAAAT AAGGTTTTAA TTGC TTTAGA TGATTTTCAA GCTTTTCCCT GCTATTATGA 2040 

TATTTGGCCT CTATTAAAAT AGTGGGTTCT TCATCCTTTT TTGTTGGATA AATAACATAA 2100 

TCAACCCTTT TTAGTCCATC TTTAAGAATA TCTGCCTTCT CTTCAACTTT AACAATTGAA 2160 

ATATCAGTAT GATCATAGCC CATCGCATCT AAAAATGGAT CAATAAGATT TTGTCTTGTT 2220 

TGTGCTTCAT .TTTCAATAAG ATC CTTATCC TTTTGAATTT TTCTACTTAC AGCTTTTATT 22 80 

GAATTTTCAA AATTTATATC TTTGTATTCA TTTGGCATAA TTATATTTTA CCAATAAAAT 2340 

TAAAAATTAA TAATTCTAAA AATAAATTTC CAAAATGTTG TCTATTTTAA ACTCTTAACT 2400 

GATACCTTAA TTCTTTTTTC TACCTAATTT TTTAGTTTAA AATCTTATTT TTTAATTTTA 24 60 

TTATTTTTTC CTTACCTTAT TTATACTAAA ATTTTTAGTA TTTAGCGAAT AATTTTCATA 2520 

TCCTTTTATT AAAGACAAAA TATGATTTTC TCTTTTTTGT TTTTTAATAC CTTAAAATCA 2 580 

CTAAGCAAAG TAATAAAGTC TTC TTTGGTT AATGAATAAA AGACTAGCTA TAATAAAATT 2640 

•" *■ 

ATTTTATTTT TCTTTACTAA ATTCAAAATG CTCTAAATAA AGCAAATTAG AGAAATTCAA 2700 
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AGGATCATTT TTAGCTATTA GCAGAGAAGT GTTTTTTACC AAAGTTAGAC ATAATGAACT 27 60 

AGCCAAAATT TCTTCTTTGG GTTGAGGCAT TGGACATTGA CAAAGAAATG ATTTTACAAT 2 82 0 

GTCGGTATTT TAAAACAAAT CTTCTAATCA TAAAATCAAA TACAGTGCAT TGAAAATAGA 2 880 

TATAATAAAC AATTTTTTAT AAAAAGATAT TGGTATTTTC TCACAATTCA TATCTATTTT 2940 

ATAGAAACAC AATAATAATT TTTAGGAGAT AAAGTGCTAA TCATGGTTCT TTCATTTGTA 3 000 

TTGCTTGCAA TTCTTCTATA AAATATTCTT TCATTTGGGT ACTGATCATC TTTAGTTAAG 3060 
ATTTTTTCTA AATCTTCTTT ATATCCTATC CATAAAAGCT TATAACCTTC TTTTACATAA . 3120 

TCATAAGTAA AAAATC TTAA ATTAAATTGA TAGATATTAG CCCCAGAATA AAGAAATATA 3180 

AAGTTTTCAT TATTATATTC CTTTAATAAA GATTTGCGAT TCTTTATACT TGGATCTGGC 3240 

CCTTTTTTAA AATTAATATCT TTCTTTACTA AGAATACTAA ATGAACTAAA TATTTTGTTT 33 00 

AATTTGGCCC ATGTTTAATT CAATTCCTTT ATAAGGATTT TCTTTGCAGT CTTTTAAGTC 3360

TCTAGTTATT CCTTAATAAT ATTATCACTA CTTTGAATAA CAAATTTTGC TTTAAAATTT 3420

AATGTAAAAG TTTATTACTA CGAGGAAATA TCGCAAATTT AAAACTTGAA TGCATATCTT 3480 

AAAACCTTTT TTTGTTTTCA AACTGATAAA TAAGTTAAGT TTATAATTAC TAAATATATG 3540 

CTTTCTTAGC AAGCTAAGAC CAAATATCAC AATAGAAGTA ATTCTCAATA AACAAAATAC 3600

AAAAAGTAGT TATCATATCG TCTTTAACCT TAAATAAGGT TGCTATAAAC AACCAAGATA 3660

TTTAATTTCT TTTAAAACCC TTATTCAATC TTTTTAAGCA TAGGATCTTA TAATTATAAG 3720

AATATAATTT TATTTACATC TCTATATTAA TAGAAAGATG CAAATATGTG ATCAAATTGT 3780 

TATTTTTGTA ATATGGAATA GTCCTTTATA GGGACGCTTA ATGCTCTATA CTTAAGATTG 3840

GAATTCTCTA TGAAAATATA TACTCGCTAC CCATGTAAAG CTGACTTATT TTAGCACGTA 3900 

TCGCTTAAAC AATTATATTT ATATTATCTT TTATAAAGTT AATTTTTTCT TGTAGATTAT 3960

TTTTTAATAA AAAAGGCACA AATTACCACA ACAAGTTCCA GTATAAATTA ATAGTTCTTA 4020

TCTCAACACT AAAGTACATA AACATCAAAT ATCAAAAATA TATAAGAACA ACATACTACA 4080

TTGTTTTAAT GAAAACCTTA AAAGGAATGG TTAAACTCTC ATTAAGCTAA AACCAATGCA 4140 

AAAATATCTT TATAAATTAG CAAAAGAACT AAAAGTCACA AACAACTACC ATAAAAATTT 4200

GGTAGTAAAT TCTGGAACTG AAATTTACTA TAAACTCAAT TATTCTAAAA AAAATATTGC 4260

CTTAAATTAA AGAATGCCTT AAAAAAACAA AATGCTCTGA TTTAAACCTA TACCCAAAAT 4320

ACAAATTTAC TAAAGAAGAA GATATAGATT TAGAGAAGAT CTTAATAATA AAAATATTAA 4380 

TATAAAAGTT GCTCAGTATG CTAAAGGCAA AGAGTTTAAG TCAAGTTTAG AAATTACAAA 4440

GAGTAAAACT ATAAACTTCC TTTAAGAATG AAAATTTATT TTTATACTTA CTTGGCTTAA 4500 
TATTAAGATT TTTTTATTCT TTTCATAATA ATCTCTTCTA TCACTTAACA TTTTGCTATA 4560

CAAAAATCTT ACACATCTAA ATACTTTTTA AAAAAATTTG ATTAGTGTTA GAATATATTC 4620

TATATTTATA AACTTTATTA GCACTCATAA TTTTACTAAA TTAATATATT ATATTTAATT 4680 

TATTTTTAAA ATTTATCTCC ATTTACCAAA AAAACTAAAA TAAAACTCTC CAAACTTATA 4740 

AATAAAAAAA TAAGGCAAAA CCCCAACAAA CTCAAGATCT ATAATACAAA AATACAATAT 4800 

AAGAATCCCA AGCTTAAAAA CAACCCCCTA AAATCTTTTT TTATTGGCGT TTTTAAATAA 4860 

TGGTAATAAA GAATTCCAAT CAACACGATC CCCCCTACAA CTTTTCAAAC CCTATAGCTT 4920 

GGCTTTTTAT ATTATTTTTA AATTTACATG TCACAACAAT AGATAATGCA TAAAATAAGT 4980 

ATTAATAAAA CAAATACATT TATAGAACCT ATACAATTAT TGAGCATATG GCTAGTACTA 5040 

AAAATGAAAA TGTACAAGAT AATATGCTAT TAATAAAAAT TAATGGCTAC TAAAACTTTT 5100

GAATCCACAT TTTTTCTTTA AAAAAATTCT AAATTATTAA AATAAATAGA AATTAAAATT 5160

ACCAAAAATA TTATTATAGT AATAAATATG TAAAGCTATT TTTATTAAAA CTGATAATAA 5220

AAATATAATA GCTAAAATAA CATAAATTAA CTTTAAATTA TATCAAAGAC TTAGATTTAA 5280

AATATTTAAT AAAAGGCAAA GCTATAAACA CCATATACTT ATTTTATTAT TTTTTTCATT 5340 

TTATTTAAAT TAATTTAAAT AAGACTCAAT CAAATAATCA ATCAAACATA TTGGGTGAAG 5400 

AAAAAATAGG GTATTCTTGG TGAATCGTTT TAAAAGGGGG TATAGTAAGC TAAAAAACTC 5460 

TTATTAAAGA GGATGTTTAT AGACTTAAAA GTCTAATTCA ATATGAAAGA GGCTTTTTAA 5520 

AGCTAAAAAT GTTAAAGAAA ATCAAATTAA GCAACAAGAT GGTTTTGTTT CTATAAATAG 5580 

TTTTAAAGAA TATATACATT TGCACATACC CTTCATTATA ACATCTACTA ATTACACAAT 5640 

AAAAATAAAA ATGATTTATT AAGAATTATT AGTAACTTAT AAAAACTTTA TAAGTTACAT 5700

AGTCAAAAAT ATAAAAAATT AAAACAAAAA ATTAACGATA TGGAAAAATT GTATTTTATA 5760

GAAATAGAAA TATATTTGCA TTAAACAACT ATGAATTTAT AAAGATTCTA GTAGGAGAGA 5820

AAATATGAAA AAAAAAAATT TATCAATTTA CATGATAATG CTAATAAGTT TATTATCATG 5880 

TAATACAAGT GACCCCAATG AATTAACTCG TAAAAAAATG CAAGACAAGA ACGTGAAAAT 5940 

TTTAGGATTT TTAGAGAAAA TTCAAGCAGA TAATAAAGAA ATTGTTGAAA AACATATAGA 6000 

AAAAAAAGAA AAACAAATGG TGCAGGCTGC TTCTGTAGCA CCTATTAATG TAGAGAGTAA 6060 

TTTCCCATAT TATCTTCAAG AAGAAATAGA GATAAAAGAA GAAGAGTTGG TTCCAAATAC 6120 

TGATGAAGAA AAGAAGGCAG AGAAGGCAAT TAGCGATGGG AGTCTTGAAT TTGCTAAATT 6180 

AGTTGATGAT GAAAATAAAC TTAAAAATGA ATCTGCGCAA TTAGAATCTA GTTTTAATAA 6240 
TGTTTATAAA GAAATCTTAG AACTTGCAGA TTTAATACAA GCAGAGGTGC ATGTTGCAGG 6300

AAGGATAAAT AGCTATATAA AAAAAAGAAA GACCACTAAA GAAAAAGAAT ATAAGAAGAG 6360

AGAAATTAAG AATAAGATAG AAAAACAGGC TCTAATTAAG TTGTTCAATC AGTTATTAGA 6420 

AAAAAGAGGC GATATTGAAA ATCTTCATAC TCAATTAAAT AGTGGACTTA GCGAGAGAGC 6480 

ATCTGCAAAA TACTTTTTTG AGAAAGCCAA AGAAACTTTA AAAGCTGCTA TTACTGAAAG 6540

ATTAAATAAC AAACGTAAAA ATCGGCCATG GTGGGCAAGA AGAACACATA GTAATTTAGC 6600 

AATACAGGCA AAAAATGAGG CAGAGGATGC TTTAAACCAA TTAAGTACTT CTTCTTTTAG 6660

GATACTTGAA GCAATGAAAA TAAAGGAAGA TGTAAAACAG CTTCTTGAAG AAGTAAAATC 6720 

TTTTCTAGAT TCTTCAAAGA GCAAAATCTT TTCTAGTGGC GATAGATTAT ATGATTTTTT 6780 

AGAGACGAGT AAATAAAAAA ATATATTTTA AAGGCTAATA ACTTAAAATC AAAGTCTTCT 6840 

GTTAAAGGAA GACTTTTTTA TAATTTTATT TAAATAACGA AAAGCTTGAT AGTTAAAAAA 6900 

TCTTTTTTAT TAAAAATATG TTTACTAAAC AGAGCTCAAA AATGACTATA TTTAGTATCT 6960

CTATAAAAGA ATTTTTCAAT ATTTTAAAAA ATTTATAGAT AAACATAATC TAAAACCATG 7020 

CATTAATACA AACCTAAAAC ATACTTGGTC ACTTGTAAAA GTAAATTGTA TCTAACTTTT 7080

TTTATTTATT GAATATACGT AAAAATTCTT TATAATTTCT ATTTTAAAAC GCTGCTATTT 7140 

AGCAATACAA TAAAAGGCAT TACAGATTGC AATCAAACAA ACTAAAGTTT AAATAAAATA 7200 

TTACCCTCTG TTCTAATCCT ATCAAACAAG GTAATAAATT CTTTAAATTT CTAAAAGCCT 7260

AAACTTTAAA AGAACTTGTC GAAAATAATA TTTCTCTTAA AAAAGGTTCT AATCTTTTAT 7320

TTATAAGAAC TTTTATACTA TTATAAAAAT GTATCTTGCC TTGATATATT TGTATTCTTT 7380

ATAAATCAAG CCTTCTACTT TTTTTAAGAA TATTTCTATT TTTTATAAAC TAGTTTTCTA 7440 

CAATAGAAAA GAAATAACCC AAAGCCCTAA AAACTTAAAT AAATGTTAGC TATAATAACT 7500 

AAAATAGAGA TAAAAAACTC AATCATAAAT AATGGTAAAA CAAACTTAAA CCACGTACCA 7560

TAACTCAATC TGGATATCCC CAATACAGCC ATTATAACTC CGCTGGTAGG TGTTATCAAA 7620

TTAATAAGCC CAGATGCAGT CTGCATGGCA ATAACAACTG AAGCTCTTGG AATTGACAAA 7680 

AAATCGGCAA GAGGAGCCAT TATTGGCATA GTGAGACTAG CATGTCCTGA TGAAGATGGA 7740 

ACAACAAATC CTATAAATAT TTGAATAATT TCATTCAATA TGATAAAAAG GGGTCTTGGA 7800

AGATTGTATA AAAAATTAGT AGCAGCATTT AACATAGTAT CTGTAATCAA CCCATCATCA 7860 

CATACTATCA TAACACCTCT AGCAAGTCCA ATAACAAGAG CAGCGGTTAG CAGACTTTCA 7920 

GAACCTTTCA CAAACGCATC CCACATTTCA GTTTCACCTA ATTTACAAAT AAAAGCCGAT 7980

ATAATAGCAA CTCCAAGATA CAACATTGTC ATTTCTTGCA TCCACCAAGC AAGATTAACA 8040
ATGCTAAATA TCAAAATCAA TATCATAAAT CCAAATAAAA GTAAAACTAA TTTATGAGCA 8100
AAAGTAAACT CAAGAGCATT CTGAGCATTA TCTCCGGTAG AAAGTCCATC TTTTTTAACA 8160

AAATATTGAT AATGTTCATC TTTTTGAGAA TACACAAGCG ATTTTGAGGG ATCCTTTTTA 8220 

ATTTTAGACG CATAAACACA AACATAGGTT ATAGCAGCCA ATACTGATAC AAAATAAAGA 8280 

ACAATTCTAA AATAAAATCC ATCCTGCAAG CTAATAGAAG CTATTGCAGA TGCAATTCCT 8340 

GTCGCAAATG GATTTACAGT AGAAGCCATA GTTCCCACTC CAGCTCCTAA AGCAATAATA 8400 

GCCGCTCCAA CAAGACTATC ATAACCCAAA GCTACTATCA AGGGAATCAT AACAAAATAA 8460 

AAAGGAAGGG TCTCTTCACT CATTCCGGTT ACAGTTCCAC CAATTGAAAA AATAAACATT 8520 

AACAAAGGAA TAAGCAACTT ATCTTTGTGC CCCAACTTCT TGATTAAAAA ATAAATTCCC 8580 

ACATCTATTG CTCCAGTTTT CATAATAATC CCATAAGCAC CCCCAACAAT TAAAACAAAA 8640

ACAATAACTT CAACTGCATG TTCCATCCCC TTTGACATTG CGGTTAAAAT AGTCATAATA 8700

GGATGTAAAA ATCCCCTAGA GCCTCGATCT ACATATTGAT AAGTTCCAGC AACAATTATT 8760 

TCCCTTTTAG ATCCATCACC CATTTGCTTA AATTCTTTAT CAAACTTACC GGCAGGAATC 8820 

ACATACGTTA AAATGGTAAC AAATACAATT AAAGAAAATA TTATTGTAAA ACTACTTGGC 8880 

ATTTTGATCA TAACGTTTCT CCTAAATAAT TTCATAAATT TAATTTCACA TAAAAAATAC 8940 

TGTTATCCCA AAGTTGATAC CATAATAGCT TTAATGGTAT GCACCCTATT TTCAGCCACA 9000 

TCAAAAACAA CTGAATTTTT ACTTTGAAAA ATTTCTTCTG TAACTTCAAT TCCATCAAGT 9060 

CCGTATTTAT CAAAAATATC CTTACCAATC ACAGTGTTTA AGTCATGAAA AGCAGGCAAG 9120

CAATGCATAA ATATTGCATC ATCTTTTGCC ATGCACATTA TCTCTTTATT AACCTGATAA 9180 

GCCTTTAGAA GATTTATTCT ATCTTCCCAA TTACTCTCCC CCATAGATAC CCACACGTCT 9240 

GTATACACAA CATCAGCACA TTTAACAGCC TCTTCTTTAG AATCTGTAAT TGTAATTTTA 9300 

CCCCCACTCT CTAGGGCTAA AGACCTAGCC TTTAGCGTCA AATCGGGGTC TGGAAAAAGC 9360

TCTTTGGGAG CAAAAATTCT AAAATCAAGC CCCATAATAG CACAGCCTTT CAATAAAGAA 9420

TTAGCAACAT TCCCCCTACC ATCGCCACAA AACACTATTT TAATCCCTTT CAAACTCCCC 9480 

TTATGTTCTT TTATTGTCAT TAAATCGGCT AGTATTTGGG TTGGGTGAGA AATATCTGTC 9540 

AATCCATTGT AAACAGGAAC ATTAGAATAA TTCGCCAAAC ATTCAACAGT CTGTTGAGAA 9600 

AAGCCTCTAA ATCGAATAGC ATCATACATG CGTCCCAAAA CTCTAGCGGT ATCTATCATA 9660

GACTCTTTTG AGCCCATTTG ATTACCCTTA GATCCCAAAT AAGTAATATT TGCCCCTTGA 9720 

TCATAGGCTG CGATCTCAAA AGCACACCGG GTCCTTGTTG AATCTTTCTC GAAAATTATA 9780 
ACTATATTTT TACCTTTAAG TrTTTTGCACT TCAATTCCTG CATATTTTGA CTTTTTTAAA 9840

TTAATCGATA AATCAAGTAA ATATTTAATA TCTTTGCTTG TAAAATCTAA AAGATTTAAA 9900

AAGCTCCTAT TTCGTAAATT ATACATCAAC CACCAACCTT TACAATCAAG TTTTTAAAAA 9960 

CTCATTTAAC TCATGCTTAA ACATGCTTAA ATATTAAATA TCCTCTCTTA CTAAAGACAT 10020 

AGACATGCAT CTTGGCCCAC CACGACCCCT TGAAAGCTCG CTAGACGGAA TTCTGTGAAC 10080 

TTTAATACCA TTTTCTTCAA ACAGCTTATT AGTTACATGA TTTCTAGAAT AAGCAATTAC 10140

TTCTCCTGGA GCTATCGCCA AAACATTAGC ACCATCATTC CATTGTTCTC TTGCACCATG 10200 

TATTAAATCT CCACCCGCAC ATTTTATTAT GTCAATTTTT CTGCCTAAAT AAAAGCTCAA 10260

AACATCTTTA AGCTTGGCTT TTTCTTTTTT AATATTAATT TTATTAGAAT TTGAATTGTA 10320 

AGTTAAAACA TAAATTGAGA AATACATATC ATCACTTGTA AAACTTGTAA AAACGCTATA 10380

ATCAATTTGG GTAAAAACTG TGTCTAAGTG CATATAGGCT CTGTTTTTTG GAATTTTAAA 10440 

AGCCAAAATT GTGCTAAATG GAGCCTTATT TTTAAAAAGA CTAGCAGCTA GTTTTTCTAC 10500 

AGACCCCGCT TCTGTTCTTT CTGAGATTCC AATAACCAAA AGATCTTTAT TTAAAACAAA 10560 

CTCATCCCCA CCTTCCAAAG AAGTTTCTTC CCATCTATTA AACCAAATTG GAACATTTTC 10620 

TTTGTAAGCG GAATGATATT TAAAAATATA CTCTGCAAAT ATTGTCTCTC TACGTCTAAC 10680 

CTTGGTATAC ATTTTATTTA TTGTAATTCC ATTGCCAATA CTGGCAAAAG GATCTCTGGT 10740 

AAATAAAACA TTGGGCATAG GATCAATAAC AAAAAGACTT GAACCATTAA CCCAATCATC 10800

AAGCGAAAAT TCACAATCTT TAAGCTCTTC TCTTGCAACG GCGGAAATCA TTTTAGAAAC 10860 

CATATTATCA ACGGTTAAAT TAGAAAAATA ATCTTTTAAA ATATTAATTA CACCATCTGT 10920

TTTTATTTCT GCTTCCAGAA TAAATTGAGA TATAAATTTA TTTTTGAGCG CTACAGAAGA 10980 

AGCAAGAACT TCACTAACAA GATCCTCAAC ATACTCAATT TCAACTGAAT TATCTTTTAA 11040 

AATATTTACA AAAACTTCAT GCTCTTGTCT TGCAACTTTA AGATAAGGAA TATCATCAAA 11100

TAAAAAATTT TTCATAATCA AGGGTGTCAA ATTTTCTAAT TCTTCTCCTG GCCTATGAAG 11160 

CAAAACTTTT TTCAAACGAC CTATTTCCGA AAATATATTT ATTGGATTTA AATATTCTTC 11220 

TTCCATCGAT TTCCCCCTTT ATGAAAATTG TCATATATTA AAATACTATA GTTTATATTA 11280 

AAAAACATCA ACTATTTTTA ATAATATTAA AAATATAATA TAAATATAAA AAATTGAAAA 11340 

AATAAAAGTT CTAAAAAACT TCAAATCAAA AACATAAACA AAAAATTATG CTAAAATACT 11400 

AATCATGAAG AATATTAATA GATTAATATT ATTAATATTA ACTACACACA CTTTATTATT 11460 

CTCTTGTGCC TTAATTGCAG ATAATAAGTC AAAAAATTTA AGCACATCAG AAATCATATT 11520 

AACACAAAAA ACACTACTAG AAAGCTCTTT AATAAAAAAT CCTTCTAATG TAGAATATCG 11580 
AATACCAATA TCCAGTATCC AAGAAATTTT AAACAATAAC AATGATTCTT TTTTAATAAA 11640

AAAAACAGCA GCAAAAATCA AAATAAGCCC TCAAAAACTT GAAGAAATAA AAAACTATCT 11700

AAATGCTTAT AAAAATTATC TAAATAATGA AACAGAATGG ATAAAGTTTA TAGATCAAAG 11760

TAGCGTCAAT GGAAATTTAA CAATTAAAAT TGATACTGCT TTTGAAAAAA AAACAAATTT 11820 

TAATCATACA AATTCAGATA ATGAAAATTT AACAGAACTA ATAGAACTAC AAATGCATCT 11880 

GGAAAAAGAA ATTTTAAACT TAATTGAGCA AACATTTCAT GATAAAAATT TAGGATATAT 11940 

ACAATTAAGT CACATCAACT CATTCTTTCC TCAAGAAAAT ATAAACTCAA TAACAAAAGA 12000

AATAATAGAT GGAAAAGAAT ATATTGCACC GCACATAATA GCAAATCAAT TATTAAAAAT 12060

AAAAGATAAA AAATATTTTG AACAATTTAT GCACTTTTTA AAAGTTGAAA ACAGCAAAAT 12120 

AAAAACAATA ATTGAAAAAC AAAAAATTTC AGATCTTCAC AATGAACTGT ATTATTCAAA 12180

ACAATCCCCG CCCAGAAGAA GAAAAAGGTC AACTGCCGAT TCCGATAATA ACAATAAATA 12240

CGATATAATA CCAAAAATAA TAGACCCAAA TACAGGCATT GAAATAACTC CTAAAAATTT 12300

AAGATCTATT TTATCAAATG GCGACATAAT ACTAATAAAA CCAAAAATAG ATTGGACAGA 12360

ATTTTTTTAT TTTTGGCAAC ATGTGGGAAT ATTTGATGAA GAAAAATATG AAGCCACTAA 12420 

AAAAATTGCA TTCAATGGAA TTGATAGCTT TGATATAAAA TCAATAATTA CAAGCAATCA 12480 

AATCAAATTC GATACAGCAT CTACTCAAGG TTCAGGATAC GAAAAGCTTT CAACATACGT 12540 

ACAATCAAGA ATATTAAAAA TATTCTCACC AATAACAGAC ATAAGAACAA TTCAAAAAGC 12600

TATTAATTTT GGAAGAAGTA GATACATTGA CAATAACTTT GGATATATGG TTCCATTAAT 12660

ATCCTCTAAT TTATGGACAG ATTCATTCAA TCTTGAAGAA ATTCACAACA AAACCTATTG 12720 

CTCTTTAATG GTTGATAGAA TATATAAAAT AGCAGGACTT AATGTATCAA GAAATTACGA 12780

AATTTCGGGA ATAATTACTC CTGGAGAAAT AAATGCAGCA GCTTACAATT TTTACATGTC 12840 

TTATACGATT GCAGGAATAC TTCCAAGCGT GCTTCCAAAA AGGCTCATTA AACGAACATT 12900 

AAAAGAAAAA TTCATTGGTT ACAATAAAGA AATAGTAGAT GCAATAGAAT TAAAAAAATC 12960 

GAAAGAAAAA ATTTTTGGGA GAGCTTGCAA CATTACAAAT CTCTGGTGCT CAGGAAGTTA 13020 

ATACTACCCA TGAACAAATA TTATGCCTGT ATTTTGATTA ACAGCAATGA AAAAATTATT 13080 

TTCAAATCCT GGGAAGAATG CAAAACCGCT ATTAAAGGAA AAAACAATAA AATAAAAAGC 13140

TTCAAAACAA TAGAACAAGC TCAAAATTGG CTATTTAATA ATGAGAATAA AATTCACCAT 13200 

CACCCAAATG GAATATATTT TGATTCTGGA ACGGGAAGAG GAAAGGGCAT AGAAATTAGA 13260

GTTGTAAACG AAAAAAGAAT TTCAATATTG GATAAAATCT TAGATAAATC CTTGATTAAT 13320 
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GAATATGGAA 
CTTGCCCTAT 
GACAGTAAAT 
CAAATTACTA 
GGTGGAAAAA 
ACTAAGTAGA 
TTTTTATTGT 
AGAATCTCCC 
TATTCAAATT 
TAACAACGAT 
TAAAAATAAA 
iTCCACTAAAA 
TCCGTCTATT 
TGTTTTAATA 
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GAGAATAAAT 
ATTTCCAATC 
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CAAAAACATT 
CTAAAAAACT 
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CAGCAGAAAT 
CAATCCCAGC 




ATTATTATGT CAAAAATTTT CAAGGAATTA 
ATACAGCTCT CAAAATAGCA TTAAAAGAAA 
TAATAATTGA CTATTGGTCA AAAGGAATCT 
TTAATTTAAT GAAAAAGACA ACTGAACTAA 
TTTCTTTTAT TCCAGGAAAT GAAAATATTG 
AATATTGTCA AAAAATACAT AAAAACAATA 
TGTACGACAA TAAAAATAAA CCATGATTAT 
TCTAAATACA TCAATATAGA TGTAATTAAA 
ACAAACAATA GCTTAGACGT AGTAAAAATA 
AAGATCGTCT TAAAAAAAGA AGATCTTACA 
TACAGAGAGT . TTTTTATTGG TCCTAAAACT 
ATTCATTCTA AAAACAAAAA TAGCAATAAC 
TTT AAGCTC A ACATAACAAA AGTAGGAATT 
ACAAGAACTA CAAAAATTAA TATTACTAAT 
TCTATTAATA ATAAACTCGT AGTTTGTATC 
AGCAAAAAAG TTCCAAGAAA TATAAATTGT 
CCCATCAGAT CCCCTTAATA AATCTGGTTC 
ATAAAAATTG AATTTAAAGC / CTGATGAAAA 
GTCTTGAGAA TTAAAGAAAT TGAAAGATTT 
TAGACCAATT TGGTCCATAT ACCCTTTAAA 
AGTAGAAAAG TAAATTTCTA AAAATTCTGT 
AAGTTCATTA TCCGTAAATT TCTGCAAATT 
AAAAGAAAGC TTATTGTCAA AAAAAGTTAA 
TAAAGAATAT GGAACAAGTT TGGTTGTAGT 
ATCATAATTA TATTCAAAGT CGTCTTTCAT 
AAGCTTAAAA GAAAGTTCAG AAACTCTATT 
AAATTTAAAA TAATCCAAAT ATCTCGGCTC 
TAAATTTTTA TAAGGCGATG ATGGTTTTTG 
TCCAGAGTTT TTCATAGCAT CTTCTTTAAA 
TTCTTGTAGC AAATAAGGAA AATCTAAAGA 
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GGAATAATTT TGGGGAACTG 13380 

ATATAATAAA CATATTTGGG 13440 

ATAATAGCAA AAAATTAAC A 13500 

GGAAAAAATT TGAAGAACAA 13560 

CAGATCTTGG TTTTCATAAA 13 620 

TTTCTGATTT CAATGGTTTA 13 680 

GAAACTGATT TTAAAGTTCT 13740 

GCTACAAATG AATATATTTA 13 800 

AATTGGCAAA AC AC TAGTCT 13860 

ATAAACAATG AAAC AGGGT A 13920 

TCATTTAAAT TTAAAGTATA 13 980 

TT AAGCTC AA CTATTAAATA 14040 

GAAGCAAAAA AAACAATAAA 14100 

AAATG AAAAT C ATTATTATT 14160 

TTTGTTGTTT TCAAATGACG 14220 

AAAATTATTT CTCCAAATAG 14280 

TAAATTATAT TCTCCAACAA - 14340 

TTTTTTAATT TTAAAAAGTG 14400 

TGATAAATCA ACAAAGAAAT 14460 

ATATTTAAAA GTCTTAGTAT 14520 

ATATTTAAAC TTCAAAGTCA 14580 

TATTTTCCAA CCAACATCTA 14640 

AACGTACAAT TCCTTTTTGT 14700 

ACCAATCTTG GAAAAATCTC 14760 

AGCAAACAAA AATTG AAAAT 14820 

TATC AAAGGA ' TC ATAGGCGA 14880 

AATTTTATAA TACAAAGCAG 14940 

AGGCTCCAAA GGACTTTGAA 15000 

CTTTTTATAA TATTTAATTC 15060 

AAGTTTAAGC TCAGAAGAAG 15120 
CTTTAATATC TTCAAAACTA TTTTTTAATT CACCTGAAAG CTCAGTAGTA AAATAATCAT 15180
AATCATAAAT TAAAGAGGCT GTTAAACTTT GATAAAAAGT TTCCGGATCA GATAAAAAAA 15240
TACTACTATT CTTATTAACC AAAGATTTTA CATCAGAATC ATATTTTTTA TTAAATGAAT 15300
ATAAAGTAGC CTTATTTTCA AACTTTAAAG TACTTCTAGA AAATAAAGGA TATCTAATAA 15360
AAGGAAGCAA GTTTAAATTT ATTTGGTTAA TAATAGAGTG CTCACTTTTT TTATCTTTAT 15420
CTTCAACTTT AAAATCTTTA TTTAAAGGAC TATACTCAAT AGTATTAAGA TATAATAAAT 15480
TTTCAAAAGT AATTAAACGA TTGTAAAAAT CAGCATGAAT TTTTATATCC GTTTTATTTT 15540
TTATATCAAA TAAATAATTT TTTATTTCAT AATTAAAGTC CTTTGGACTT GTTATGCGAT 15600
AATTATCAAA AAAAACATTA TTTCTTAAAT AAGGATTAAT GCCAAACCTA ATAAAAAAAG 15660
AATCGGATTG ATCAATATTT TTTAAAGTAA TTGGTTCTGG AGGAATATAT AAATCTTTGG 15720
TTAATTCTGT TGTTTTTTTA GTATTTTTCT CCTTCACACT CTTTTTATCA TTATCTTTAT 15780
CTTCTAGATT TTTAATTTCT GGGCGCATTA TCATTTCTTT AGTATCAGCT GGAAATGTCC 15840
ATTGGTTATT GTAAAGATCT TTTTGAAAAT TCAAATCAAT ATATGGAGCA TAAATTCTCT 15900
CCAAATAAAA CCATTTTCTT GTAGGATCAT TAACATCTTT TGGTTTCTCT AAAGGAGATT 15960
TAACATAAAG ATTTTCATAG CCCGACAATT TAAAACTTAA ACCTAAATTA TTTAATTTAT 16020
AATCTAAAAT CGAACCGTCA TTAAATGTTC GCTTATAAAA AGAAGATAAA TTCCAATCAA 16080
AAGTGCTAAT GCTAGTTTGC TCTTTAACCG AATCTTTATC TAAATTTAAA AGAGAAAAAA 16140
ATGTAGCACT TTCTATCCTA TCTCTAAAAT CAATATTAAC ATACGGGTCA GAATAGTGCT 16200
CTAAAACAAC CGAGAAAAGT GCATCACTTA AAAGAAATTC TGTTTTAAAT TTAAATAAAT 16260
ATCTAAAAGG AACTTCAAAC CCAAATACAT CTCCTTTGTT AAGATTGGAA AAACTAAAAA 16320
GAGATTGTTT TAAAGTCCTA TTATCAAAAG GATAATATCC TCCATCGTAA CTATAAACAT 16380
TCCTGGTAAA ACCCAATCCA AAATTTCCTT CCAAAGTTTT AAAATGCCCC AAAGTATTGC 16440
CCAAATTAAA ATCAATTCCA GAATAAAATC CCAGATTAGC ATAAATGTCA AAAATAAGCT 16500
TAACATAATC TTTATTAACA CTGGGTGCTA AATTTTCTGC AAAAAAATAA GTTAAATATC 16560
CATTTCTTAT ATAAGGTTTT TTACCCGAAT TATAAACAGA ATTGAAATCA AAATCCAAAA 16620
AAGAAGAATC TTCACTTGAA GATTTATTAC CAAAAAGATA AACGGTATTA AAAACAGAAA 16680
AACCTTTTCG TGGATTTAGA CCTAAAGATG GATTAAAAAA CAAACTATCT CCCGGTCTGA 16740
AAAAAAAAGG AATATAAAAT ACTGGAACTC TTCCCATGTA AAATATGGCA TTTAAAAACC 16800
CAAAATCTCC CGAGGGCAAT GCCCATATTT TAGAAGCCTT GATTGAATAG TAAGGCTCTG 16860
GAATTTTACT AGTTGTTGCA AAAGCTTGTT CCAAAATGGT AACATCATTG TCTATCTTTT 16920
TTAAAACCTT TCCTCCAAAC GAAAGAATAT GATCTATTTG ATTTTTTTGC ATTTTTTTTT 16980
GAAGAATACC ATTTTTTAAT AAAAAATTTT GAGAATGAAA ATCGACAAGA AATTCATTGC 17040
CATAAAAATA AAGCTTTTCA TTGGTATCCA TATCAAGAAT ATATTCAACA TTTCCAATAG 17100
CATAAAGTTT TTTAGAGTTC TTATTAAGGA CTATTCTGTC GCCTTTAATA TTGTGCTTTT 17160
TATTTTCTTT AATATCTTCA ACCAAGATAT TAACTCTTCC TTCAAAAATA ATACTTTCAT 17220
CTTTAGTAAG TCCATAAGTG AAATTTTCAA GATTATCTGC AGTTTCAATG ATTATTTTAT 17280
ATCTACCAGA TCCGGCAAGT CCCTTTCCTT TGATAAAAAG CTCAGGATCT ATTCCAAACT 17340
TTTTTAAAAG CAATTCTCGT ATTTTTGAAA CATCTGTTTC TTTTAAACCC TCTTTTAAGG 17400
CCCATTTTTT TAAATCCTCA TCGGTTGAAA GCTCAAGTTC TCTTAAATAA GATTTTTGAC 17460
TTAAAGTTAG CTTATCCCTT TTTTTAGAAT TTTCATCATC TATAGTCTGG GCAAAAATTG 17520
CATTAGAAAA TGTTAAAAAA ATTA&AAATA CTATAAAAGA TTTTTTAAAA ACATTCCTGT 17580
ATAGGAATTC TCGCATTTTG CAACCTCTTC AGGAATACCA GAAACAACGA TATTTCCCCC 17640
TGCCAACCCA CCATCAGGAC CCAAATCTAT TATATAATCT GCCTGTTTAA TTACATCCAA 17700
ATTATGCTCT ATTAGTACAA CTGTATTACC ATTGGAAACT AACCGCTGCA AAACCTCTAA 17760
CAACTTCTTT ATGTCATCAA AATGCAGCCC AGTTGTTGGT TCATCAATAA TATAAAAGGT 17820
TTTACCCGTG CTCTTTTTAC TTAACTCAAA AGCCAACTTA ATGCGCTGAG CTTCTCCTCC 17880
TC3ATAAAGTT GTTGCAGATT GTCCTAATTT AATATATTCA AGTCCAACTT CAATTAAAAA 17940
TTTTAAATAA TGACTAATTT TTGGGACATT CTCAAAAAAT TTACTTGCCT CAAAAACACT 18000
CATCTCTAAA ACATCATGTA TATTTTTTCC TTTGTATCTA ACTTCTAAAG TTTCTTCATT 18060
GAATTTTTTA CCCTTACATA AATCACAAGG AACAAAAACA TCTGGTAAAA AATGCATTTG 18120
AATATTAAGA TACCCATCTC CTTGACATTT CTCACACCTT CCACCTTTAA CATTAAAAGA 18180
AAATCTGCCG GCTTTAAAAC CCCTTGACTT TGCATCTGGA AGCTTGGCAA AAAGCTCCCT 18240
AATTTCTGTA AAAAATCCAA CATAAGTTGC TGGGTTTGAT CTTGAAGTTC TCCCTATTGG 18300
TTTTTGATTT ATTTGAATAA TTTTATCGAT TTTTTCATAC CCAACAATAT CTTTAAAGCC 18360
ATCACAATAC TTTTCATTAA GCTTTAATCT ACTATCAAGA GCTGGATATA ACACCTCGTT 18420
AAGTAAAGTA CTTTTTCCGC TACCAGAAAC ACCTGTTATT ACGGTAAAAA CTCCCAAAGG 18480
GATACTTAAG TCTATATTTT TAAGATTATT TTTATTAGAG CCCAAAAGCA AAATTTCTCC 18540
CTTATCTGCC TTTCTTCTAG AGCTTGGAAC ATCTATTTTA AACTTGCCGC TAAGATATTG 18600
ACCAGTTAAA CTATTTTTGC TATTTAAAAT ATCAATCAAG GCTCCCTTTG CAACTATTTC 18660



CCCTCCAAGA ATTCCAGCAC CAGGACCCAT ATCAATAATA TAGTCCGCAG TACGCAAAGT 18720
TTGCTCATCA TGTTCAACAA CAATTACAGT ATTACCAAGA TTTTTAAGAT TAACAAGAGT 18780
AGAGATTAAT TTTTCATTAT CTCTTTGATG AAGACCAATA CTTGGCTCAT CAAGAACATA 18840
AATAACACCC GAAAGTGCTG ATCCTATTTG AGTAGCAAGC CTAATACGCT GAGCCTCGCC 18900
ACCAGATAGA CTACCTGATA TTCTATTTAA ATATAAATAA GAAAGGCCAA CATCAATTAA 18960
AAATTTAAGC CTACTTTTAA TTTCCTTTAA AATTTCTTTA GATATTTTTT CGTCCACCAT 19020
ATCAAGCTGC AAGTTTTCAA AAAATACATA AGAATCAAAT ACTGACAAAT TGGTAAGATC 19080
TTGAATGTCT TTTCCATTAA TTTTCACAGT TAAAGCTCCA ACGCTTAGGC GTTTACCTTT 19140
GCATGAATTA CATATTTTTT TAGACATCAA ATTTTCGTAA AAAATTTTAG TACTCTCTGA 19200
TTCTGTTGCA AGATATCGCC TTTTTAAAAG GGGCAAAAGT CCTTCAAATG TTTTAGAATA 19260
ATGAAATCCT CCATCTAGCT CTTTTGCTTC CATTTCTTTG GACTGGTAAA TAAAATCTAT 19320
TTTTTCATTT GAGCCGTATA AAATCTGTTT AAGAACTTTA TCTGGAATGT CTTTTATGGG 19380
AGTATTTAAG TCAAAATTAT AATGTTTAGC AAGTCCTTTA AAAATAGCCA CAGACCAAGA 19440
TGAACTTGTC TTAAACGTAA GAAAAGCATC ATCATTAAAA GAAAGACTAG TATCAGGACA 19500
AATGCTCTCA AAATCAAACT CAAGTGTAAC GCCAAGACCA GAGCACTCAC TGCAAGCACC 19560
AAATGGACTA TTAAATGAAA AAAGTCTGGG TTCTATAAGA GGAAGTGAAA ATCCACACAA 19620
AGGACAACTG TTGTGCTCTG TAAATAGTTT GTCTATTTTT TCCAAATCAT TATCAATTTC 19680
CACTCTTAAA TATCCATTAG AAACAGCAAG AGAAGTCTCA ATAGATTCCG CAAGTCTAAC 19740
TCGAACATTA TTACCAAGCT TAATCCTATC AACTATAATT TCAATGGTAT GTTTTTTATT 19800
TTTATGTAAA TTTAAATTAA GTGCATCTTC TATTAAATAA TCTTCAGAAT TTATCCTAAC 19860
TCTATTAAAA CCTTGATTTA ATATTTTTTC TAAAACCTTT TTATGAGAGC CTTTAGACCC 19920
CCTTACAATT GGTGCAAAAA GTATAACCTT GGATCCTTCA GAATAACTTA AAATAGTATT 19980
AACTATTTTA TCTAAAGATT GCTCTTCTAT TAATCTACCA TCATTTGGAC AGTATGCTTT 20040
ACCAATTTTT GCAAATATTA GTCTATAGTA ATCATAAATC TCAGTAATTG TTCCAACAGT 20100
AGAGCGGGGA TTATTGCTTA TTGTTCTCTG CTCAATAGCT ATAGAAGGAG AAAGTCCATC 20160
TATATAATCA ACATTGGGTT TTTTCATTAC ACCTAAAAAC TGCCTTGCAT AAGCTGAAAC 20220
AGATTCCATA TACCTTCTTT GCCCTTCTGC AAAAATAGTA TCAAAAGCCA GAGAAGACTT 20280
GCCAGAGCCA CTCTTGCCAG ATATTACAAC TAAACCATCT TTTGGAATAT CTACATCAAC 20340
ATTTTTTAAA TTATGTTCTT TTGCTCCTCT GACAATAATT TTTTTTTTCA AACTTTTTTC 20400
CAAAAATTAC ACCTCTCTTT TTTTATTACG AGCTATACTA ATTTTGCTAC TAAGCTCTTT 20460

TATTTTATCT CTTAAAACAA TTGCGTCTTC AAATCTTTCA TCATTAACAG CTTCTTCTAA 20520 

GTCAAATTTA AGCTTATCAA TAAGCTTTTT TTTAGACAAT CTCTCACCCG AAATAATTTT 20580

TTCAAAATCA TAGCCAACAT TTTTATTTTT ATTATTAAGT TCCTTTTCTA AAATATTTTG 20640 

AATCTTTTTA ACAATTGTCT TAGGAGTAAT ATTATTTTTT TTATTATAAT CAATCTGAAT 20700 

TTGACGTCTT CTATTAGTCT CCTCAATTGC CTCCCGCATA GCTAAACTAA TTTTGTCGTA 20760 

ATACATTATT ACAAGTCCAT TAGAATTTCT AGCAGCCCTA CCAATTGTTT GTATTAATGA 20820 

AGTAGTAGAT CTTAAAAATC CCACCTTATC AGCATCTAAT ATTGCAACAA GAGATACTTC 20880 

TGGAATATCT AAGCCCTCTC TAAGCAGGTT AATCCCAACA ATAACATCGA TTTCAGATTT 20940 

TCTAAGCAAC GAAATAACTT CCACTCTCTC AAGGGTATCA AGCTCTGAAT GTAAATATTT 21000

TGCCCTTACG CCAAGATTTA CCAAATATTC AGTCAAATCC TCAGACATTT TTTTTGTCAA 21060 

AGTAGTAATT AAAAGCCGCT CTTTAAGAGC CACTCTTTTT TGAATTTCGC TGTAAAGATC 21120

TTCCATTTGC CCATCAGAGT GCCTAGTAAT AATTTCAGGA TCAACAAGAC CTGTTGGACG 21180 

AATTATTTGG TCAACAACCA CACTACTTTT CTCATTCTCT TCAACACCCG GGGTTGCAGA 21240 

TACAAACACA ACCTGATTAA TTAATGCTTC AAATTCATCA TATTTAAGAG GTCTGTTTTC 21300 

AAGCGCTGCA GGAAGTCTAA ACCCAAAGTT AACAAGATTT AATTTTCTAG AATGATCTCC 21360

ATTATACATT CCCCTAAATT GAGGCAATGT AACATGAGAT TCATCTACAA ATAATAAGTA 21420 

ATCTTTCGGA AAAAAATCAA AAAGACAATA AGGTCTTTCC ATTGTACTTC CACTCAAATA 21480 

TTTAGAATAA TTTTCAATGC CCGAACAAAA CCCTGTTTCT CTAAGCATTT CCAAATCATA 21540 

CTCTACCCTC TGTTTGAGTC TCTCGGCTTC TACAAGTTTG CCATTGTCTT TAAAATATTG 21600 

ACATTGAAGA CTTAAATCAT GAGATATTTT GGGTATCGCT TCTAATACAT TTTCATAAGG 21660 

AATTACAAAA TAAGATTTAG CAAAAAGAGT AAAACTATTT GTAGCTCCTA AATTTTTTTT 21720 

AGAAAATGAA CTAACTCTAT ATATTTCAAC AATTTCATCA AAATCTAAAC AAATTCGATA 21780 

AGCAAACTCT CCATGTTCAC TGCTAGGCCA AATTTCAACA ATATCTCCCT TAATCGAAAA 21840 

TTTATCTCTT TCTAGATTCA TTAAAGTTCT CTCATAATAA AGCTCTACAA AAATATCTGA 21900

TATTTCTTTA ATAGAAATCT TTTGACCTAC AAAAAATTCT CGTGCTGATT TTTTGAAAAA 21960 

ATCTGGAGAT CCAAGAGCAT AAATTGAAGA TACGGTTGCA ACAACAATTA CATCTCGTCT 22020 

TTTAGCAAGA GACGTTACCG TTCTTATTCG CTTAATTTCT ATCTCAGTAT TAATAGTGGC 22080 

TTCTTTTTCA ATAAATAAAT CTTTTGAAGG AACATAAGAT TCTGGCTGAT AATAATCATA 22140 

ATAAGAAACA AAATACTCAA CAGCATTATT TGGAAAAAAA TCTTTAAACT CTCTATAAAG 22200
CTGTGCTGCT AATGTTTTGT TGTGACTGAC AACTAAGGCA GGCCTGTTTA GATCTTTTAT 22260

TATATTTGCA ATTGTAAAAG TCTTTCCACT GCCTGTAACA CCTTTTAAAG TTTGATATTT 22320

ATTTCCAAGC AAAATAGAAT TTTCAATCTC TTTTATTGCC TTAGGCTGAT CCCCAGCAGG 22380

AAGATATTCT GACTTCAAAA AAAAATCTAT CATTAATTTA ACGACCAAAA TTTAATACAC 22440 

ATTCTTATAA ATTATATGAT AATAAATTCT ATATCAAGTA TATAATTCAT TATAAATCAA 22500

TATAATTTAA TTAATCTTTG TTTAATAAAA TAAAAGGAAA TATTGATGCT AAAAATCGAA 22560

GCTAAAAGAA AATTGAAAAA TTATATTCTT CTTGAAGAAG ATATGCATTT TAAAGAAGAA 22620

GCAATAAAAA TTCAAAAAAC AAATAATTCA ACAGAAATTT TAAATAGATT TTACAAAGAT 22680

CTAGAATTTG GCACTGCTGG AATAAGGGGA ATCATTGGAG CTGGAACATG TTACATGAAC 22740

ACATATAATA TAAAAAAAAT AAGCCAAGGA ATATGCAATT ACATACTTAA AATAAACAAA 22800

AACCCTAAAG TTGCAATAAG CTATGATTCA AGATATTTTT CAAAAGAATT TGCTTACAAT 22860

GCTGCTCAAA TTTTTGCCTC AAATAATTTT GAAACATATA TATATAAAAG TTTAAGACCT 22920

TCCCCACAAC TATCTTATAC AATAAGAAAA TTTGACTGTG ATGCTGGCGT TATGATAACA 22980 

GCAAGTCATA ATTCAAAAGA ATATAATGGA TATAAAGCAT ATTGGAAAGG TGGAATCCAA 23040 

ATAATACCAC CTCATGACAC ACTAATAACT AATGAAATTA AAAATACAAA AAACATAATA 23100 

AATACAATTA CCATAAAAGA AGGCATTGAA AAAGGGATCA TCAAAGAACT TGGCAATGAA 23160

ATAGACGAAG AGTATGTGAA AGCAATAAAC AAAGAATTGC CTGATTTTGA AAAGAATAGC 23220 

AAAGAAACAA ACTTAAAAAT AGCCTACACA GCATTACATG GCACCGGTGG GACCATAATA 23280 

AAAAAACTCT TTGCAAATAG CAAAATACGG CTTTTTTTAG AAAAAAATCA AATACTACCA 23340 

AACCCTGAAT TTCCAACAAT AAATTATCCT AATCCAGAAA AACAAACATC AATGCTTAAA 23400

GTAATAGAGC TTGCAAAAAA AAAAGATTGT GACATTGCCC TTGCAACAGA TCCAGATGCC 23460

GACAGAATAG GGATTGCATT TAAAGATCAA AACGAATGGA TATTCTTAAA CGGAAATCAA 23520

ATATCATGCA TTTTAATGAA CTATATACTC TCAAAAGAAA AAAATCCTAA AAATACATTT 23580

GTAATATCAT CGTTTGTAAC AACACCAATG CTAGAAAAAA TTGCAAAAAA ATATGGTTCT 23640

CAAATTTTTA GAACTTACAC AGGATTTAAA TGGATAGGAA GCTTAATTAA TGAAATGGAA 23700 

AAAAATGAAC CAAATAAAAA ATTTGCTTTT GCATGCGAAG AAAGTCATGG ATATCTAATA 23760 

GGAAGAAAGG TTAGAGATAA GGATGCATTT TCAGCCATAA AAGGAATTTG TTCTTTAGCA 23820 

CTTGACTTAA AAGCCAAACA ACAAACAATT AAGGATTATC TTGAAAAGAT ATACAAAGAA 23880 

TTTGGATATT ATGAAGAATT TAATATAGAA AAAAACTTTG AGGGGGCCAA TGGAGAAATT 23940



CAAAGAGAAA AGTTAATGCT AAAACTAAGA AAAGAACAAA AAGTACAATT TGCAGGAATT 24000 

AAAATAATTG AAAAATTAGA CTATAAAACT CTTAAAAAGA TTAACTTTAA AAATGAAATT 24060 

TCAGAAATTA AAGAATATAA ATACCCCATA AACGCAATAA AATTTATACT TGAAAACGAA 24120

ATTGCAATAA TTGTAAGACC CTCTGGAACA GAGCCGAAAA TTAAATTTTA CATATCTGTA 24180 

AAACTAGAGT ATAAGGAAAA ACATAAAATA TTTGATATAA TAAATGCAAT AAAGATGGAG 24240 

ATAAAAAAAT ATTAACATAA CAGAAAATTT AATAAATTTG GTAGAAATAG ACTCAAAAGA 24300 

AATTGCAAGA AAAAATAAAA ATAAAGAGGT TTCAATTTGG CACTTATTAA TGTCTATAAT 24360

TACCACTCCC AAAAAATCCG AAATAAAATT TATAGATAGC AAAACTCTAA AAAACATTAA 24420

ACAAGAAGTT ATATCTGAAA TAGATAAATT AGAGAAAATT TTAATAGAAA AAAACGAAAT 24480 

AATTATTCCC AAAATCAATA AAGAAATCTT TGCTCTCATA AAAGAAGCTA AAAAGGAATT 24540 

TAAATCCAAA CCTTTAATAG GGGCAAAAGA AATTTTTTAT CAAATATTAA AAAATAAAAA 24600

ACTTCTTAAA AAACATAAAC TAAGTAAATC TAGCTTTAAC TTTAAAGATC AAAATATATT 24660

AGAATACATG GAAAAAAATA AAATAAGATT AATTGAAACC TACAAAGAAT TTGATGAAGA 24720 

AATACGACTT GAAAATGAGC ACTTTGAAAT TGGAAAGTAT GTCAAAAATT TAACAGCACT 24780 

TGCAAAAGCC AAAAAATTAG ACCCCTTGGT TGGAAGAGAA GCAGAGATTA AAACTCTTAC 24840 

AAATATACTC TTGAGAAGAA ATAAAAATAG TGCAATGCTA ATAGGCGAAC CTGGTGTGGG 24900 

AAAAACAGCA ATAGTTGAAG GCCTTGCATC AAGCATAGTG CAAAAAAAAA TAAGTAGCAA 24960 

ACTACAAGAC AAAACAATTC TAATGCTTAA GGTTTCAAAC TTGGTATCGG GAACAAAATA 25020 

TAGAGGCGAG TTTGAAGATC GTTTAAATAA TATAATTAAG TATATTGAAA AAAACAAAAA 25080 

CACAATCATA TTTATTGACG AAATACACAC TCTAATAGGA GCTGGAAACT CTGAAGGAGC 25140 

TCTTGATGCA TCAAATATAC TAAAACCATC ACTTTCTAGA GCTGAAATAC AAATTATTGG 25200 

CGCAACTACT TACAATGAAT ATCGAAAATA TATTTCAAAA GACAAAGCAT TCGCCAGAAG 25260 

ATTCCAAACA ATTACCGTAA AAGAGCCTGA TGAAAAAGAT ACACTAAAAA TAATCGAAAA 25320

TATTGCAAAA AATTTTGAAG ACTATCATGG AGTGATCTAT GAAAAAAGCG CGCTTTTAAA 25380 

TATAGTAAAA CTTTGATCCA AATATCTAAT AAATAAAAGA TTTCCAGATA AAGCAATAGA 25440 

TATAATAGAC ATTGCCGGCG CAATTAAAAA GGAAGAACTT ACAAAAGACA ACATCATAAC 25500

ATCAGATGAT ATACAAAAGG CAATAAATGA AATATTATCT ATTAAAACAG CAAATAACAC 25560

TAAAGAAGAA ATTTTAGAAT TAAAAGAAAT AGAAAGCGAA ATAAATAAAA AGGTGATCGG 25620 

ACAAAAACAT GCGGTAAGCG AACTTATCAA AGAAATTATT AAAGTCAAAC TTGGACTTAA 25680 

TGACGATTCT AAGCCTTTAA CTTCAATATT GTTAATAGGA TCAAGTGGAT GTGGAAAAAC 25740 
TGCTTTAACT GATGAAATAT CTAAAAAAAT TATCAAAGAT CAAAATTCAG TATTAAAACT 25800
AGATATGTCA GACTATAAAG AAGAAAACTC TATTTCAAAA TTAATTGGCA GAAATCCAGG 25860
ATACGTAGGC TACTCTGATG GAGGCATTCT GACAAATAAA TTAAGACATT CATTTGAAAC 25920
TTTAATATTG TTTGAAAATA TTGAAAATGC CCACAGCTCT GTATTAAACC TAATAAGTCG 25980
AATGCTTGAA AACGGAGAAC TTATTGACAG CAAAGAAGAT AAAATACTAT TTAAAAACAC 26040
AATTATAATA ATGACTACAA ACATTGGATC TAGAATGCTT CTTGGAGAAA AAAATATTGG 26100
ATTCAACAAA AATCAACAAA AAAGCTTAGA AACAAAAAGC TTTAAAGAAG AAATAAACCA 26160
AGATCTTGAA AAAAGATTTA AATTATCCTT TTTAGACAGA ATTCAAAAAA AAATCATCCT 26220
AAATATCCTT ACAAAGGAAA ATGTAGAAGA AATTTGCAAA AACTACTTAA ACACCCTTAA 26280
AACAAAATTT CACTCTAAAG GAATCGAGAT AGAAATAAAA AAAGATGTTG ACAAATTCAT 26340
AACCACAAAA TACTATAAAA AAAATTCAGG AGCAAGAAGC GTAATTGCTG CAATAAAGGG 26400
GAAAATAGAA GAAAATATTA TCACCAAAAT AGCTGAAAAT CAAAACATAA ATAAAATAAC 26460
GATTTATTTA GAAAAAGAAA AAATAATAAT AGAATAAAGA GGAATTATAA TATGTTTAAA 26520
AAAGTAGAAA ACAAGGCAAA TTTTCCTAAA ATAGAAGAAA AAATATTAAA ATTTTGGAAT 26580
GACAATAAGA TCTTTGAAAA ATCAATAAAG CAGAGAGAAG GATGTGAAGA ATTTACATTT 26640
TATGACGGAC CGCCTTTTGC AACAGGACTT CCTCATTTTG GACATTTTGT TCCAAACACA 26700
ATAAAAGACA TAATTCCAAG ATATCAAACA ATGCAAGGCA AGTATGTTAA AAGAAATTTT 26760
GGATGGGATA CTCACGGACT ACCTGTTGAA TACGAAGTAG AAAAAAAATT GGGAATTTCT 26820
GGAAAATACG AAATAGAAAA TTATGGCATT GAAAATTTTA ACAAAGAATG CAGAAAAATA 26880
GTACTTAGAT ATACAGAAGA ATGGAAAAAT ATAATCTTGA GACTTGGACG ATGGGTAGAT 26940
TTTGAAAAGG GTTACAAAAC CATGGATATA AGCTTCATGG AATCCGTGTG GTGGGTATTT 27000
AAAAATCTTT ATGAAAAAGG TTTAATCTAC GAAAGTTACT ATGTACTACC CTATTCCCCA 27060
AAGCTTGCAA CTCCGCTTTC AAATTTCGAA GTGAATCTTG GAGAATATAA AGAAGTCAAT 27120
GACCCATCAT TAACAATAAA ATTTAAAATA AAAGATAAAA ACGAATACTT ACTAGTGTGG 27180
ACAACCACCC CCTGGACATT GCCCTCAAAC CTTGGAATTG CAGTAGGACA AGAAATAGAA 27240
TATTCTAAAA TTTTTGACAA AACAAAAGAA GAGATTTTAA TACTTGGATC AAAAAAGCTT 27300
AATAGCTATT ACGATGATGA AAATTCATAT ACTATTATAG AAAAATTCAA AGGCAGCAAG 27360
CTTGAAGGCA TAGAATATGA ACCTATTTTT AACTACTTTT TAGAACAAAA AGATAAGGGG 27420
GCTTTCAAGG TACACACAGC TGATTATGTT ACAACTGACG ATGGAACAGG AATTGTTCAT 27480
ATTGCTCCTT TTGGAGAAGA AGACTACAGA ATACTCAAAA AACACACAAA TGTCGATATA 27540

ATAGACCCCT TAGATGCTGA ATGTAAATTT ACAAATCAGG TAAAAGATTT TAAAGGACTT 27600 

TTTGTAAAAG ATGCTGATAA AAAAATAATA GAAAACCTAA AATTACGCAA TTTTTTATTC 27660 

AAAAGAGAAA ATTATCTACA CAGGTATCCA TTTTGTTATA GAACAAACTG CCCAATTATT 27720 

TACAGACCAA TAAGTTCGTG GTTTGTAAAT GTAGAAAAAA TAAAAACCAA ACTTTTAGAG 27780 

GTAAATGAAA AAATTAATTG GATGCCAGCC CATTTAAAAA AAGGAAGATT TGGAAAATGG 27840 

TTAGAAAATG CAAAAGATTG GGCAATAAGC AGAAACAGAT TTTGGGGAAA TCCAATTCCA 27900

ATTTGGATAT GCTCAAAAAC AGGAAAAAAA ATTTGCATTG GATCAAAAAA AGAGCTTGAA 27960 

AACCTATCTG GCCAAAAAAT CGAAGACTTA CATAAAGACC AAATAGATAA AATAACCTGG 28020 

CCAAGCAAAG ACGGTGGCAA ATTTATCAGA ACAAGCGAGG TTCTCGATTG TTGGTTTGAA 28080 

TCTGGAGCAA TGCCTTACGC AAGCAACCAT TATCCATTCA CAAATGAAAT TAATTTTAAA 28140

AATATATTTC CTGCTGACTT TATTGCAGAA GGTCTAGATC AAACAAGAGG ATGGTTTTAT 28200

ACTCTTACAA TCCTGGGAAC TGCTCTTTTT GAAAACACAG CATTCAAAAA CGTTATTGTA 28260

AATGGACTTG TGCTTTCAAG CGATGGAAGA AAAATGTCAA AATCCTTTAA AAATTATACA 28320 

GACCCAATGC AAGTAATAAA CACCTTCGGA GCTGATGCTT TAAGGCTTTA TTTAATAATG 28380 

AGCCCTGTAG TTAAAGCTGA TGATTTAAAA TATAGCGACA ATGGAGTAAG AGACGTTCTT 28440 

AAAAATATAA TAATACCCAT TTGGAACGCT TATTCATTTT TCACAACTTA TGCAATAATT 28500

GATAAATTCA AACCTCCAAA AAATCTCAGC CTGGCTAAAA ACAATAACCT TGAC|^\ATGG. 28560 

ATCATAAGCG AACTTGAAAG TCTAAAAAAA ATACTAAATA CAGAAATAGA CAAATACAAT 28620

CTAACAAAAT CAATAGAATC TTTACTTGAA TTTATAGATA AATTAAACAA TTGGTACATA 28680

AGAAGATCAA GGCGAAGATT TTGGAAATCA GAAAACGATA AAGACAAAAA TGATGCCTAC 28740 

GAAACATTAT ATTATGCAAT CAAAACTTTA ATGATTTTAC TTGCACCTTT TATTCCATTT 28800 

ATAACAGAAG AGATTTATCA AAATTTAAAA ACTGATGAAG ACAAACAATC AATACACCTT 28860 

AACGATTATC CAAAAGCAAA TGAAAATTTC ATTAACAAAA CAATTGAAGA GAAAATAAAT 28920 

CTCGCAAGAA AAATAACTTC AATGGCAAGA TCACTCAGAT CATTGCACAA TATAAAAATA 28980 

CGCATGCCTA TTAGTACGAT ATATATCGTC ACAAAAAATC AAAATGAACA AAATATGCTA 29040 

ATGGAAATGC AAGAAATAAT ATTAGATGAA ATAAATGCAA AAGAAATGAA AATAAAAGCT 29100 

AACGAAGAGG AGCTTATAAC TTACAAAGCA AAAGCAAACT TTAAAGAACT TGGGAAAAAG 29160 

CTTGGAAAAG ATATGAAAGC GGTATCTACT GAAATTAGCA AGCTAAAAAA TGAAGACATA 29220 

ATAAAAATAA TAAATGGAAC ATCCTACGAG ATAAAAGTAG CCAATGCAAA GCATTATTTA 29280

TCATTAAATG ATATAATATT AGAAAGAGAA GAAAAAGAGA ACTTAAAAGT AATAAATGAA 29340

GAATCCATTA CAATAGGAAT AGACTCACTA ATCACTAAAG AGTTGTACTT GGAAGGGCTG 29400 

ACAAGAGAAT TTGTAAGGCA AATACAAAAT TTAAGAAAAG AAAAAAATTT TGATGTTAGC 29460 

GATAGAATAA ATTTATACAT AGAAAATAAT GAAACTTTGA AAGAAATGCT AAATAAATTT 29520 

GAAAAATACA TTAAAACTGA AACATTAGCC TTAAATATCA TATTAAACAA AAGTAAGCTA 29580 

GAAAAAAAAA TAAACCTTGC CGATGACATA TTTACACTAA TAGGAATTGA AAAATGTTAA 29640 

AAACATTAAC AAAAATAATT ACCATTTCAT GCCTCATAGT GGGATGCGCA AGCCTGCCTT 29700 

ACACTCCTCC AAAACAAAAT CTAAATTACT TAATGGAACT TTTACCTGGC GCAAATTTAT 29760 

ACGCCCATGT AAATTTAATT AAAAACAGGT CTATTTATAA CTCTTTAAGC CCTAAATATA 29820 

AATCAGTTCT TGGGCTTATA AGCAATTTAT ACTTTAGCTA TAAAAAAGAA AATAACGATT 29880

TTGCTCTACT AATAATGGGT AATTTCCCAA AAGATATTTT CTGGGGAATT CATAAAAATA 29940

GAAATACAGA ATCAATAGGC AATATATTTA CAAATCCAAA ATGGAAACTT AAAAATTCAA 30000

ATATATACAT TATTCCAAAC AAAGCTAGAA CTAGCATTGC AATAACCCAA AAAGATATAA 30060 

CCGCAAAAGA CAATAATATG CTAACAACAA AATATATTGG GGAAATAGAA AAAAATGAAA 30120 

TGTTTTTTTG GATTCAAGAT CCAACATTAT TGCTCCCAAA CCAAATAGTA AGCAGCAAAA 30180 

ATTTAATTCC CTTTAGCAGT GGAACTTTGT CTATAAACAG CTTAAATCAA GAAGAATATA 30240 

TTTTTAAATC CTTAATCAAA ACAAATAATC CACCAATACT AAAAATATTG TCAAAAAAGT 30300 

TAATTCCAAC CGTCTTGACA AACATGACAA ACCTCACAAT ATCAAGCCAC ATAAAGACCA 30360

CAATAAAAGA CCAAAATACG GTTGAAATAG AATTTAATAT TCAAAAATCT AGTGTTGAAA 30420 

GCCTTATAGA AAAACTAGCT TCAAATATTC AAACCTAAAA TTTCTGCCAC TCCACTAAAA 30480 

TGAGGTATTA TTTTGATTTT TGCAAGTAAA ATAATGAAAC AAAGCTCCAA TTTTACCAGA 30540

TTCATTATTA AATTTAGTAG GCTCAAGTGC TACAAGATTT TTTATATTAT TATTATTATC 30600 

AAAAGCCATT TCTAAAGACC ATAAATTTTC TAATTTTTCA TATATTCTAT CTATTAAATC 30660 

GGGTCTTGCG CTTATTCCTC CTCCGATCAA AATTTTTTCA GGATTCAAAA TAAAAGTTAA 30720 

ATTAAAAATA CCAAATGACA AATTCTCAAA AAATCTATCA ACTTCATTTT TGGCATGAAT 30780 

ATTCCCATTC TCAGCTAGAT CAAAAACAAA TTCTCCTGAA ACCTCTTTTA AAGGTTTTCC 30840 

TAATCGCATA GCAACTCTTT TTCTTAAAGC CGAAACAGAC GCAATGGATT CCCATTTGCA 30900 

ATTAAAGGGA ATATTGTTGC TAATACCTCC AGTAATCATA AATCCAACCT CTCCGGACAT 30960 

AAAAGAATTT CCTCTTAAAA GCTTGCCATT TGCAAAAATT CCAGCACCAA TTCCTGTGCC 31020

AAGAGTTATA GCAATAAAAT TATTAGAGTC AATAGCATTA CCCTTAAATT TTTCTGCTAA 31080

GGCTACACAA TTAGCATCAT TTTCAATCTC TGTACTTACT CCGGTTAAAG ATTCTAATCG 31140

CTCTTTTAAA GGATAATTAA CAAATCCAGA AATAGCATTT ACCCTAAGAA CATTTCCCTT 31200

AAGATCAACA AACCCAGGAA TACAAATTGC AACTCCCGCA ATATCACTTG ATTCTTTGTA 31260

AGAATTAATA ATATTAACTA AAATATTTAC TTGTTCGTCA GAAGTAGCAC CTGTGCTTAT 31320

TTCATTTTTA TCAAAAAAAA CACCGCTTGA ATCTGAAAGC GAATATTTGG TACTAGTCCC 31380

GCCAATATCA ATCGCTAAAT AATGTTTCAT ATTTATCCTC AAGGCCTAAT TGTACATAAA 31440

TGATCATAAA TTTTCTATAG CAATATAAGA AATTTTAGAA ATCTTGTTTA TAACTATTTG 31500

CGCTTTAATC ATCTCCTTCA AATAAGAAAC ATGGGAAATT ATGCCAATTT GTCGCCCAGT 31560

CATCATTTGA AACTTAGAAA GCTTAGGCAT AACTTGAGCC AAAGTATCTT CATCAAGATT 31620

GCCAAAACCT TCATCTAGAA AAAAAGCCTC TATTTTTAAC TCACTATCCC TTATTTTATC 31680

AGATAAAGCT AAGGACAAAG CTAAAGACAC AAGAAATTTC TCACCCCCAG ACAAAGTTTT 31740

TACCGTTCTT ATTTTATTAA CATCTTTTTT GTCTTCAATT AAAAAATCAA ACTCTTTGCT 31800

CTCTTTGTTC GTTTTGAGCT CAAAATCAGG AAAAATCCAC CTTAAATACT TTTCATTTGC 31860

CAGTCTTAAA ATATCATTAA TTAAAAAAGT TTGAACATAA TATTTCAATC CAGAAGATCT 31920

AATAACGACC TTCCTTAATA CATCTAGCTT ATCTTTCCTT TCTTTAGCAA GATTTAATTC 31980

ACCCCTTAGC GAGTCTAAAT TAATTTTTTG TTGATTAATC TTTTTTTGAA GAGTTTGAAA 32040

ATTTAAAAGC TTGATTTTAT ACTTTTCAAT ATCTCTGGAT AAAAATTCAA GCTTAGAACT 32100

AATTGATAAA CTCAATTTTT GCAAAAAAGA CAGACTATTC TTATTAATTT GCTCTAAGTC 32160

AGTTGTTGAT TCAAAAGAAA ATGAACTAAA AAAAACATTT TTTAAATTTG AAATTAGATT 32220

AATAAAATTA TTTTGCTCTT CATTTAATCT CGCTTTTAAA GTCAATATAG ATTCCTTTGT 32280

AAATTTAATT TGTGTTTCGG TTTTAATTTT TAAATCTTCT AAATTTTTAA GATTTAAAAC 32340

AAGCATATTC CATTCATTTT CCACACTTTT CTGCTTAGCC AAAACAACAT TAAATTCCCT 32400

TTCCAAAGAA GAATAATCAT TAAAACTTAA ATTTAAATTA AGTTTTAACA ATAAATCCTT 32460

AATCTTTAAA AGATTTTGAT CAAAATTTTT ATTTTTCAAA GAAATTTCAA TTTTTAAATC 32520

TTTTTTCTTA GCCTTAAATT GTTCAAGCTT TTCTAATTTA TTTTCAAATG CTAAAATTTT 32580

TTCTCTGTCA AAATAATTTA TGTATTTGTC AAATAAATTT TTTCCAATCA ATCTTAAAAT 32640

TTCAGCATTA TTCTTTTCAA ACTCCAAAGC ATTAACCTCT TGTTTAGAGA TTTCTTCTTG 32700

TCTACAAGAT AGCTGATATT TAAGCTCATC AATTTGATTT TGTATCAAAT GAAGCTTTGA 32760

ATTTAAAGCG TAAAGCTCTT TCAAACTGCA AAGCGATAAT TTATCTTTAT GCTGATAATT 32820

TTTATATTCA GCCTCAATAT ATTTTAGCTT CTCTTTATCA CTCTCAATAG AAAAATTTTT 32880

ATCATCAAGA TATTTTAACA ACTCTTTATA CAAGTCAATC TTAATAATAT 32940

ATTTTTATTA AAATCTTCTT TGCCAGTAGA TTTTAATAAA AATTCTAGCC TATCTCTATA 33000

TTTTGAAATC AACTCATCAT TAAAAGTTTG GAACAATTTT AATGCTTCAT AGTAAACATA 33060

TTTGTCAAAA TCAAAATTAG ATTTCTCAGA AGAAATGCTT TTAATCTCAT 33120

GCTCTGAGAT CTTAATAATT CTTTTTTTTC ATCCTCAAGA CTATTTTTCC TTAAAAGCAA 33180

GCTTTGATAA TCATTCTCAA CAAGTTTTAA ATTAAAAAGA TTGCAATTTT TATTATAAAG 33240

CTCCTTAACA TAATTAAAAT TAAAATCATT GCTATACAAA TCTTTAATTT CTTCTAAATT 33300

ATTCATCACT TTTGAAAGTT CTAAATCAAG CAATCCAATA TGATTAAACA ATTCGCCTTG 33360

TACAGCAACT AAATTTTTCA AACTCCAAAA ATCTGAACAT AAATAAACTT TTCGATCCAA 33420

ATCCAAATTT TCTTTTAATT TCTTTTGAAG GGAATGATCC TTCTCTAAAG AATTTAAATA 33480

TTTAATCTGA GAAGATAATT GATCGTTTAG AGAAGACATT TCAATTTCCA AGCCCAAATA 33540

TCTCTCATTA GACGCTATTG CCTGATTACA TAAAGAAATA GCTCTTCTAA TATTTTCAAG 33600

ATCAATTTCT AATCGATCAA TATCAACCAA ATCCAAATAA CCTTTAAGGG ATTTACACTC 33660

ACTCTCATCA TAATCAAGAA TTGATTTTTC ATAAGATTCA GAATTCAACA ATTTATCTAT 33720

ATTAAACTTT GTGCGCTCAA AATCACTTTT TAAATAAAAT TCCAAATTAT CATATTTTTT 33780

CAAATTAAAA ATATTATCAA TTATTGCAGC TTTCTCTTTG GGAGTTGACG TTAAAAATTC 33840

TTGAAAATTG CCTTGTGGTA AAATTACAGT TTGACAAAAT TGGTTAAAGT CTAATCGACA 33900

AAGACTTTTA ATATGCTCTA AAACATCGGT TCGACCCTCT ATAATCCTAT TATCAAAAAA 33960

ACAATTAAGC AACATGCTCT TAGGAGTCTC TATATTTTTT ACATTAAGCT CAACAAAAGA 34020

TTCATAAATC TTCCCAGAAA TAGTAAACGT TAATTTAACA TAAGCGCTAG TCTCGCCTTT 34080

TGATATAATA TCTACAATTT TTTTTCCAAG TCTGTAAACA CGAGCATATA GTGCCAAAGT 34140

TATGCAATCT AAAATGGTGC TTTTGCCTGA TCCAGTATTA CCAGAAATTA AAAAAATGCC 34200

CGATTGTCTT AAAAGAAGTG TATCGAAATT CAACTCATGT TCGCCTTTAT AAGAAGCAAT 34260

ATTTTTAAAT ATGAGCTTAT TTATCCTCAT ATTCACCCAA ATATCCGTTA GCTAAAACCT 34320

CGTTAAAAAG AGAAATAAGC TCTTCTTCCT TAAATTTGAT ATCCCTAATA ACACCGTTCT 34380

CAAAATCCCA TCTCAACTTT TTCTCAAAAA AATATTTTTC ATCCATTTCA AGCACTTCAA 34440

GTTCTCCAAT AAAGTTGGAA TCGTCTTGCA AATCCTGACT TGAGGGTAAA GAATAGGAAA 34500

TAGAAACTAA ATTCATAAAA TTAAGCCTTG CTAAATCATA AATAGACTCC TCAGCGCTAG 34560

TATCAACTGC CTCATTAAGT TCAATTTTCA AATAAATAGT AAAAGACTCT TCTTTTTTGG 34620

AATTAGCCAA AAAATCAAGA ACTTCATTTA AAGAACCTTT AGCAAAGATT AATTTATTGA 34680 

AAATCGGCAC CGGAAATGCT TCTTGCAAGA TTAATTTATT GTCATTAAAA TGTAAAACGT 34740

TTATGTATTT ATCACAGGTC TCATTAAATG AATATTGCAT AGGAGATCCT GAATAAACAA 34800 

TATTATCTCT TAGTTTCATG AACTTATGAA TATGCCCAAG AGCAACATAA GAAAAACCAT 34860 

TTCCAAAAAC ATTAAAAGGG ATAATATAAC TACCTCCCAA GGTGTCAATC TTTTTACTGC 34920 

TGCCAAAAAA AGAATGCGCC ATTAATATCT TAGGAATTCC TTTATACTTG TTTTCTAAAA 34980 

AATTAGATAA ATTTGATATT TTTTCTCTGT AGGCATTTTC TAAATTTTCA AGAAATAGTT 35040 

TGCTGGAATA CTGATCTTCC AGTCCAAAAA TATTGTCAAA ATTTTGACCT AAAATAAGCC 35100 

TTTCATTTAT ATGCGGAAGA CAAACAACAA TAAACTTAAG ATTTCCATTA TCTTTTAATA 35160

AAACTATTTG CTCATCAGAA TCATATTCAG TTATTAAAAA AAAATTAAAC CGTGAGAGAA 35220 

GTTTTTTATT TATACTCAAA TAATCCTTTT TGTCATGATT TCCAGAAATA ACCACACACC 35280 

ATTTACAAGA AGTAAAAGAA AGTTCATAAA AAAAATTATT CACTAATCTT TGCTCTTCAA 35340 

ACCCAGGCCT TTTGGAATCA TAAACATCCC CGGCAACAAG TAAAAGATCT ATATTTTCTT 35400 

TTTTAATAAA TTCTAAAAGA AAATATAAAA AATTTTTCTG CTCCTTAAGA ATTGAAAAAT 35460 

TTTCAATTTT TTTTCCAATG TGCCAATCTG AAGTATGCAG AATTTTATAA TTGCTCACAA 35520 

AATACTCTCA CTTTTTTTAA TTCTTAAATT ATATTTATAT ATTATAATAC AATATATAAA 35580 

CATAGGGAAT TTATGCAAAA TAAAAAGTTG ATAATAGTTG AATCGCCAAC AAAAGCCAAA 35640 

ACAATAAAGA AGTTTCTCGA TGAATCATTT CTAGTAGAAG CATGCATTGG ACATGTAGTA 35700

GATCTACCAA ACAACGCAAA AGAAATCCCA AAAGAATATA AAAAATACGA ATGGGCAA^T 35760

ATTTCTATAG ATTATAACAA TGGATTTAAT CCAATTTACA TTATTCCCAG CAATAAAAAA 35820

CCAATTGTAT CAAAACTAAA AAAATTAGTA AAAACAATAA ATGAAATATA TCTTGCAACC 35880

GACCAAGACA GAGAAGGAGA AACTATAGCA TTTCACTTAA AAGAAGTATT AAAAATCAAA 35940

AACTACAAAC GGATGATATT TCATGAAATC ACAGAAACCG CAATAACTGA ATCACTAAAA 36000

AATACTAGAA ATATAGACAT GAACCTTGTT AATGCCGGGG AAGCTAGAAG AATATTGGAC 36060

CGACTATACG GGTATACAAT CTCTCCACTA CTTTGGAAAA AAGTAGCTTA TGGACTTTCT 36120

GCTGGGCGAG TACAATCTGT TGGATTAAAA TTATTAATAG AGAAAGAAAA AACTAGAATA 36180 

AATTTCAAAA AGGCAAATTA TTATTCAATT TTACTTCAAT GTAAACACGA GAAAAAAAAC 36240 

TTGTTGCTTG AAGCAAAATT AGAAGAAATT GACGGCAAAA ATATAGCAGA GGGTAAAGAC 36300

TTTGTAAATG AAACTGGAAA ACTTAAAAAT ATTGCCAAAA CAACAATAAT AACCCAAGAT 36360

TTAATGATAG AGCTTGAAAA AGAATTAAAA AATGGACAAA AAATTGAATT AATTTCAATA 36420

GAAACTAAAA AAATAAAAAT ACCTCCTCCA AAGCCATTTA CCACCTCTAC ACTTCAACAA 36480

GAAATAAATA AGCGTCTTAA AATTGGAACA AAGCAAATCA TGCAACACGC TCAAAAACTT 36540

TACGAACACG GATACATTAC CTATATGAGA ACAGACTCTC ATAATATTGC TAAAATTGCA 36600

AAAGATAAAA TAACAAAAAT AATAAAAAAT AAATATGGGA AAGAGTATAT AGAGGAAAAA 36660

GATAGAATTT ATGAAAAAGA AAAAATGGCT CAAAATGCAC ATGAGGCAAT AAGGCCTTCT 36720

GAAATATTTA TTCCAAATGA AACCATAGAA ATAGAAAGCA AAACCGCTAA AGAAATTTAC 36780

AAAATAATAT GGGATAGAAC CATTATTTCT GGAATGAAAG ATGCAATAAA AGAAAATATA 36840

AAACTGACTT TTAAATATAA AAACTTAATT TTCAGATCAA GTTTTACAAA AATAATTTTT 36900

GATGGATTTC TTAAACACAC TAAAGAACAA GATGAACATC TTAACATAAA TTTTGACTTA 36960 

ATTAAAAAGG GAGATACATT TTCCATAGTT AAAATGAAAA CAAGTGAGCA CGAAACAAAG 37020 

GCTCCATTTA GATACACAGA AGCGTCTCTT GTGCAAAAAA TGGAAAAAGA AGGAATAGGT 37080 

CGTCCCTCGA CCTATTCTAC AATTATATCA ACACTTTTAG AAAGAGAATA TGCATTCAAA 37140 

CTTAACAACA CATTAATGCC AACTATAAAA GGCGCTGCTG TAATAAATCT TCTTGAGAAA 37200 

TATTTTCCAG TACTCATTGA ACTAAATTTC ACCTCTAATA TGGAAGAAAA ATTAGACAAA 37260 

ATAGCAATAG GAAAACTAGA TAAAATAAAA TATCTAAGTA AATTTTATAA TGGCAAAAAA 37320 

GGACTAAAAG ATACAGTAAT GCAACTAGAG CCTAAAATTG ATTCCTCTGA ATTTAGAACC 37380 

GTTATTGAAA GTCAAAAAAT AGAAAATAAA AATAGCATTA ATTACACAAT AAACATTGGT 37440 

AAATATGGGC CTTATTTGAT ATTCAAAGGA CATAATTACT CAATTAATGC AAAAACTCCA 37500 

TTAGAAAATT TGTACAAAAA AGATGAAATA GAAAAAATAA TAAATGAAAA AGAGCTAAAA 37560

CCCAATATAC TTGGGGTTGA TCCTTTAACA GGACTTAATG TGATCTTTAA AAATACAATT 37620 

TACGGAAACA TTGTTCAACT TGGAGAAGAT ACCCATGCCC CTCAAGAATA TACAAAAAAA 37680 

GGAAAACCTA AAAAATTAAA AATAATAAAA GCAAAAAAAG CATCAACTAA AAAAATTGAC 37740 

CCTGAAAACA TAACATTGGA GCTTGCTTTA AAATTGCTCT CACTGCCAAA ACCAATTGGC 37800 

AAACATCCCC AAACCAATGA ACAAATCATT GCTGCAACTG GTGTTTTTGG GGATTATATT 37860 

AAAACTGAAA GCGGAAGCAT TGCTTGCTCG CTAAAAAAAG ATTTAAAAGC ATATGACATA 37920 

ACACTAGACA AGGCCATCAG CCTACTCAAC GAAAGAGCCA ATAAAGTGGG TATAATCGTT 37980 

AAAACAATCA CATTTTCTAA AAACAAAATT GGCAACAAAA TATATATTTA CAAAAAAAAC 38040 

GACAAATTTT ATGCTAAAAT TAAAAGAAAG AAGATTGATT TACCTGATAA CATTAATCTT 38100

GAAGAAATAA ATGAGAAATA TGTATTCAGC TTGTTATAAA TATGAATGAT TTCAAACTCC 38160

CAATTTATAA ATACAAAGAT GAATTAATTA AAGTACTAAA AAACCACAAT GTTTTAATTG 38220

TAGAAAGTCC AACAGGTAGC GGAAAAACCA CCCAACTACC AAGAATAATA TATGAAGCGG 38280

GTTTTGCAAA ATTAGGAAAA ATTGGAGTAA CTCAACCAAG AAGAATAGCT ACAGTATCAA 38340

TAGCTGAATA TATTGCCAAG CATATTGGCG TAAATGTTGG AGAAGAAGTT GGCTATAAGA 38400

TAAGATTTGA AGAAATTACA AGCCCAAAAA CCAAAATCAA ATTAATGACT GACGGAGTGC 38460

TTCTGCAAGA GCTAAAAAAA GATACACTGC TTTATGAATA TGATGTAATA ATAATAGACG 38520

AAGCACACGA AAGAAGTTTA AACATTGATT TTATATTGGG TCTTATCAAA GACATTTCAA 38580

GGAAAAGGGA TGATTTTAAA ATCATAGTTT CGTCTGCTAC AATAAACACA AAAATATTTT 38640

CAAAATATTT TAATAATGCA CCGGTTGTTA GTATTGAAAC TATCACTTAC CCAGTACAAA 38700

TAATATACAA TCCTCCTCTT TTAAACACAT CAAAAGGAAT GATATTAAAA ATAAAAGAAA 38760

TTGTCTTAAA CGTAATAAAA GAAAAAAAAG CGGGAGATAT TCTTATATTT TTATCTGGAG 38820

AGAAAGAAAT AAAAGAAACT ATAAAAGAAT TACAAGAATT AAACTCAAAA AAAAATTTAA 38880

TAATATTTCC TTTATACGGC AGAATGCCCA AAGAAGCTCA AGAGCAAATA TTTATGACTA 38940

CTCCTAAAAA TAAAAGAAAA ATAATAGTGT CAACAAACAT AGCAGAAACT TCAATCACAA 39000

TTGAAAATAT TAAAATAGTA ATAGATAGTG GAAAAGTTAA AACAAATAAA TTCCAAACAA 39060

AAACTCATAC CTATTCGCTC CAGGAAGTTC CAATTTCAAA ATCATCAGCA ACTCAAAGAG 39120

CTGGTCGAGC AGGAAGACTT TCAAAAGGAA CTTGCTACAG ACTTTACAAA AGAGAAGATT 39180

ATCAATTAAG AGAAGATTAT CAAAAAGAAG AAATATATAG AACAGACCTA TCTGAGGTAG 39240

TGTTGAGAAT GGCAGATATT GGAATTAGAG ATTTTACCCA CTTTGACTTT ATCTCAAAAC 39300

CATCAACGCA TTCGATTCAA ACTGCAAGCA AAATATTAAA ATCTCTGGAT GCTATAAACA 39360

ATAAAAACGA ACTTACAGAA ATTGGGAAAT ATATGATACT ATTCCCATTA ATACCAGCAC 39420

ATTCAAGAGC ATTAGTCGAA GCAATGATAA ATTACCCACA AGCGATCTAT CAAACCACAA 39480

TAGGTCTATC ATTTTTATCC ACAAGTGGAA TTTTTCTACT ACCCCAAAAT GAAGAAATGG 39540

AAGCTAGACA AGCTCACTTA AAATATAAAA ATCCAATGGG AGATTTAATT GGGTTTGTTA 39600

ATATCTTTGA AGATTTTAAA AAAGCTCTAA ATAAAGAAGC TTTCACAAAG GAAAATTATT 39660

TAGATCTACA AGGACTTGAA GAGATAGCAA ATGTGCAAAT GCAGCTTGAA AACATTATTA 39720

GCAAATTAAA TATACCAATA ATACAAAAAG GTGTTTTTGA CAACGAAGGA TATTTAAAAT 39780

CAATAATGAG AGGAATGAGG GATTATATTT GCTTTAAAAC TTCAAAAAAG AAATATAAAA 39840

CCATCAAGGC TCAAAACGTA ATAATTCATC CTGGATCACT TATTAGCACC GATTCTGTGA 39900

AATATTTTGT TGCAGGAGAA ATTATAGAAA CTACAAAAAT GTATGCAAGA TCTATTGGTG 39960

TCTTAAAAAA AGAATGGATT GATGACATTA TCCTTAATGA AGAGTTTAAA CATAACGACA 40020

TATCTAGCAA AGAGAACCAA ATAACAAATA CCGGGCAGAC AAAAATTATC AATGAAATCA 40080 

AAATAGGGAA AAAAATTTTC AAAGCGGAAT ACAAAAATAA CATTTATGTA ATAAAAATTA 40140 

ACCTAGAAAC GCTAAAAGAA ATAATTTTTA AAAACGAACT AAACAATCAA AATAATGAAG 40200 

ATCTCAAAAA AATTAAAATA CAATTGATGC ATAAAAATAT AACGGTTTTT AACAACAAAA 40260

AATTTTTAGA AACTATAGAA ATAGTCAAAA ACATGGGAAA AGATTGGCAT TGTATAAAAA 40320 

AATATGAAAC AAAGAATGTA AACATTGACG AACCTGAAAA AATGAAAAAT CTTTTAGAAT 40380 

GCACAATGCA ATTTATAAGC TTTCCCCCCA AAAAAAACGC TCTTTTTTTA TCGTTGGAAA 40440 

CAGATTATTC TGGAAATTTT AGACTAAAAC CCAAACAAAA TTTCATAATG GCAATAGAAG 40500 

AATCTATAGA AAGCATAAAA AGCCTTATAG AAAACAAAGA ATACATACAA AAGTTACATT 40560

TTATAAAAAA ATTAATAAAT AAGGTTTACA AAAAATTAAA TTACTTTTTT TAAAAACTAA 40620 

ACTTTGAAAG CCTTGTTATA ATATAAAATA TAATAATCAA ATATTATTCA AAGTTAACAG 40680 

CAATGAAGTT TATAATAAAT TATGAACTGG CTATCCTTTT TTTATGTTTT ATTATTTTTA 40740 

TTAATTTTTC CTTTTGAATT ACAGAGTAAT AATAAAGAAA ATATAGAAAA TTTAATAAAG 40800 

CTACATATGC TTTATGATTT AACCAATAAC CTGTCAAAAG AATTAGAAAC AATAAATAAA 40860 

ATTAAAAATT TTGACTTAGA ACAACATTAT CTGCTAATTA CAAAATATTA TCTAAAAATA 40920 

AAAAAATATA AAGAAGCTAA TGATTTTTTA AAAAAAATAA ACCAAAAAAA GATCAAAAAT 40980

CAAAAAATAA AAAACGAAAT CATTTCGCTA AAATTAAGAA TAAATGAAGA TAATATTAAT 41040 

GAAGAAGAAA TCAAAAAAAT TTTAAATAAC GAAAAAAATA TAGATGTCAA AATAATTTAT 41100 

CAAATATTCA GTCTTATAAA ATTTAAAAAT AAAAAATTAG CAAATAAAAT TAAAAACATA 41160

ATACTAACAA ACTATCCCAA AAGCATTTAT TCTTATAAAA TAAAAAGAAA TGAATAAAAA 41220

AATATTAACA CTGCTAGTAT TGATTTTAAG TATTTCATCA GTACTAATGC TGTCCAAATC 41280

AATCACCAAA AAATCCAAAT ACAAAATTAT TAGGGATTAT TTCATAAACA GCAATTATGT 41340 

TCTGGTGAAA ATTGAAAATA AAGATCTAAA ATTTACCATA TCAAAACCTA TTTACGACAA 41400 

AAAGCTAAAT AATTACTTCT TTAAAGGCCA AACAACAAGC CATTTCTTAA TTTCTAACAA 41460

TGTTGACATT GCAATTAACA CAAGTCCATA CGAAGTTAAA CAAAACATGT TTTTCCCAAA 41520

AGGACTATAC ATATATAATA AAAAAATGAT TTCAAAACAA ATAAATAACT ACGGAGAGAT 41580

TGTAATAAAG CACAACAAAA TTATATTAAA TCCCAAGGAA GACGAAATAG AAAACTGCGA 41640

TTATGGATTT AGCGGATTTT TTGTTTTAAT CAAAAACGGA AAGTATAAA^^AAATTTTAA 41700

AGAAACAAGG CACCCAAGAA CAATAATAGG AACTGATAAA AATAACAAGC ATTTATTTCT 41760

TGTTACAATA GAAGGAAGGG GTGTCAATAA TAGCAAAGGG GCCTCTCTTA ATGAAGCTAT 41820 

TGATTTTGCA TTAAGCTACG GCATGACTAA CGCTATTAAT CTAGACGGGG GGGGCTCAAG 41880 

CACTCTTGTT GTAAAATCAA ATAACGCTCC TTACAAATTA AACTTCACAG CAAACATCTT 41940 

TGGACAGGAA AGACCTGTCC CATTTCATTT AGGAATAAAA CTTCCTAATT GAAAAATCTC 42000 

CAACCGATAT TAAATCCAAG CATAATCTCA GTTGTTAACC CAGAAAAATT TTTATAATTA 42060

GAAAATGGAG AAATAGAAAG CATAAACAAA GGCCTAATAT ATAAACCATC AAGATCGGGA 42120

ATAAAAAGAT CAGCAGCAAG CCCCATGCTT AAAAAAAAGA GATTGAAGTT ATTAGAGCTT 42180 

AAATCAAAAG CATATTTAAA CCCAATTAAT GGGAAAAGCA TTCTAACCTG CTCTTTGAAA 42240 

ACCATTGGAT ATGTTCCATA AAGCCCAAGC GAGAAATATC TCCCATTGTG AGTAACAACA 42300 

AAAGCCTCTT TGTAAGACAT TTCAAAAAGT ACATAATTTG CATCAAAAAA TAAATTCAAA 42360

TTAATCCCAT GATCTGCTCT GGTAAAATTT GGAGCAAATT TAGTGGCGCC TGTTTTATCA 42420

GTATAATTAG TAAATTGATA AGAAAAACCT CCACCAAAAG AAAGTGGATA AGAAACAATT 42480 

AAATTATTAG AAGAGATGAG AAATAAAATA AAAAAAAGAT ATTTCTTCAT TAACAATCCT 42540 

TAAAAATTCT AAAAAATACT ATATTATTAT AGTAACACAC TAAAGTAGTA TATAAAAAAT 42600

CTGGGAAATT ATGAATACAA AAACATTATA TTTAATATCC TTAATTCTTT TAGCTTGCAA 42660 

TAAAAATAAC AAAATTCCTC TCATTCAAAA ATTAGATTTG CCCAAAAGCA GCATTCTTGG 42720 

CTTTAGCAAT AAAATGGGCA TAATAATAAA AGATTATGCT TTTCTTAGTA AAAGCACTAA 42780 

GAAAAATAGC GAATTGGATT ATGATTACGC AATTCTACTC AGAAAAGACG AAGTCGTAAA 42840 

AATTGAAAAA ACACTAGAAA AAACAGAGCG CTATGGAATT GAAGGAAATT GGATCCTAGT 42900 

CAATTACAAG GGAACTAAAA GATACATCTT TAGCAAAGAC ATCAATATAG TCAACAATTT 42960

AATAATTGAT CATTCTAAAT AGCTTTACTA CATAACCGGA CAAAAGTCCG ATCAATGTAA 43020

TAAATTACTT ATTTTTTTTC TTATGTCTAT TTTTTCTTCT TTTTTTCTTC CTTTTATGGG 43080 

TAGAAATTTT TTTTAATTTT CTTTTTCTTC CGCAAGGCAC TCCCTAAATC TCCTTATCAA 43140 

CCTTAAAAAT TAAATTCAAA ATGTTATCTA AATCATCCTC TGGATGAAGC CATAAAACAT 43200

CAGAAATTTT GGGAAAAAAG GTCATTTGCC TTTTTGCATA TAAAAACGAA TTTTTGTTTA 43260

TTAAACCTAT TATATCATTT AAACCATAGC AAGGTCTACT TTTCCATAAC AAAAACTCAT 43320 

TATAGCCTAT TCCTTTAAAA GCCGGAGTAT TTTCATTGTA ACCCTTGCTA AATAAACCTT 43380

TAATCTCGGA AAGTAGTCCA CTATTAAGCA TTTCATTAAT TCTTATTGAT ATTCTGGTTT 43440

TCAAATCTTC AAAAGATCTC TTAAGGCCTA TAATCACAAT ATTTTTAAAT TCGCTACTTT 43500

GTTTTTTTTG AAATTGGCTA ATAGGAATTC CTGTTTGATA GTAAACCTCA AGCGATCTTT 43560

TAATGCGATA AATATCATTC TTATTTAACA TATTAAATCT GATGGGATCT ACATTTTTTA 43620

ATTCTTTTAA AAGATAAGAT TTACCCTTAA GCTCTAAAAG ATTGTTTACA TAAATTCTTA 43680

TTTTAGAAGT AACCAAGGGT GTTGAAGGAA ATCCATCCTT TAAATGCTTA AAATAAAAAG 43740

CAGTACCTCC TACAAATATA GGAATTTTTT TTTTCTGTCT TATTTCTTTT ACTATTTTTA 43800

AAGCTTGTTC GTAAAAAATT CCAATAGTAT AATCCTTTTC GGGATCTAAA AAATCTACTA 43860

AATGATGCTT TATATGTTTC ATTAAATTTT TACTTGGCTT TGAAGAAGCT ATATTAAACT 43920 

CTTTATAAAC TTGAATAGAG TCAACATTAA TAATTTCTGC TTTATTTTTT GGAAAATGAA 43980 

ATAAAATATT GCTTTTGCCC ACAGCTGTAG GGCCAAAAAT AAAAACTACT CTATCTTCCT 44040 

TCAATTGAAT ATCTAAATTT ACCAGTCAAA ACGTCATTAA AAGCTTGCCC TAATATTTTA 44100 

TCATCAAAAA TAGCGTGTTC TGCTAAAGAA ATATTGTCTA TAATTTGCTC AGTGCGCATT 44160 

ATAGTTGCAA CAACAAGTTC ATAAAAATTT CCATCAAAAT CTTGTATTTT TTTTAAAGGC 44220 

ACTTTAGTCA TTTAATCTCC TTATAAACCA AACATATTAA ATAAGTTTAG CTATTTTTTT 44280 

TCAAGTTTCT TTTATCAATT TCATCTCTAA TTAATTCAAA AACTTCATTT GGATTAATAT 44340 

TTGTAACATC AATCACTAAA TCGGTCTCTG AAAAATAATC ATCAATATCT ATATTGTAAA 44400 

TAGCTAAATA TCTTTTTTTG TCATTTTCAT CTCTAATAAA AGTACTGCTT AAAACGTCAG 44460 

AATACATGCC CCCCTCTCTA GTCATTATTC TCTCAGCTCT AACTTCCATT TTAGCATAAA 44520

GATATATTTT TAAATCAGCA CTCTTAGAAA TCCAAATAGC AAGACGAGAT GCAAGCACTG 44580 

TATTATTTTT TCTAGAAAGC ACAGACAATC TATTATCAAG GTATTTATCC CAATAATAAT 44640 

CATTTCTGCC TATTATCTCT TTTTCATAAA ACTCTGAAAA AGGAATATTA TGCTCTCTTG 44700 

CAATATCATG AAAAGTATAA TTAATAAACT CAAGACCGTA ATGTTTGGCA ATCATCCCGC 44760 

TTACAGTAGT ATTGCCACAA CCACTCTTAC CAGAAAGTGC TATTTTCATT CAACATTCCT 44820 

TATTTGCTCT TTTATTTTTT CTAAATTTAA CTTCATATTC AAAATCAAAT TCTTAATATC 44880 

AAGATCAACC GCCTTATTAC TCATGGTTGT TATTTCTCTG TGCATTTCTT GAGAAATAAA 44940 

TTCAAGAGCT TTACCACATA TCTCATATTC AAGATTTTTA TAAAAAGTTT CTATATGAGA 45000 

ATCTAAGCGC ATGATCTCTT CATTAATGTC TAGACGAATT GCCATTTTAG CTGCCTCTTC 45060 

TGCAATATTT AAATCTCTAA ATTCATCCAT TAATTTAGAA ATATTTTCTT TAATACTTGC 45120 

AAATAATTTG ACATTTATAT CACTGCAAGC ATCTTTAACA ATTTTAAGGT CCCGCTCTAT 45180 



WO 98/58943 

TAACACAAGG 
ATTGTAATGT 
ATGTTCACTA
GTTGACACTA
AATAAAGCTT 
TCCTCATCAA 
TTAATATTAA 
TTGGGATTAA 
ACATTGCCTC 
AAAATTTCTG 
ACACTAAACA 
AATATAACAC 
AAACAAGTCT 
AAAAAAATAA 
ATCTGGATTA 
AGCACTTAGA 
CATAAAATCC 
TTTAATTTCC 
AACAACTTCA 
TTCAAAATCA 
TGCTGAAAAA 
TAAAAAAGAA 
TAAATCTCCA 
TTTAAGCAAA 
AATATACTGT 
GTAGTCAACA 
TTCACAAGTT 
ATCTTTAACA 
ACCTAAAAAC 
CTTTCTAGAA 
TGCCTCAAGT 
AAAAGACCTA 
ATTATTACTT
TGTCTGACTT GGTATTTTCT
CTTCTAGAAC ACCTTTAAAC 
TTATCAAAGC TCCTTTTAAT 
GATTGGTATG TGCCAAAGAA 
TCGTAAAATT CACACTAGGA 
TGCTAATATA TTTTGAAATC 
GTAACCTAAA TTTAAATTCT 
TATAGTTACC AATTATCTTT 
CCTTACAGAA AATCGTGTAA 
CCAGATATTA ATAAACTTTT 
GTATTCTTAT TTATTTTTTT 
AAATTTTCCC TATTTGAAAG 
ACTTCAACAA AATCGGCAAA 
AAAATTATAC GTTTATTCTT 
CTAGGATGAT GAGCATAATC 
ACCCTTCTTT TTATACCGCT 
AAAATTGATT TCCCATTACT 
TTTAATACAT TATGAAATAA 
AAACAAAAAT ATTCACTCCT 
GACCCATAGC TAAAAATACT 
TTATTATCAT CGGAATTAAT 
AAAAAAGCCT CTTCAAGAGC 
TTGGTTAAAA TAAGCATATT 
TCAACAATAA AAATATTGCT 
CTTGACCCCA CAATAACATT 
GCCGTAGTGG TAGTTTTACC 
AGCTCTCCAA GAGCCTCAGG 
AAAACTTGCA AACCATCCTT 
TCAAGCTGTT TTAATGAAAA 
AAAATTTCAT CGGTATAAAA
CCTTCAAAAC TTCTTCCATT 45240

AACCCATAAA TCTCTTCTTG 45300 

GATAAAAAAT GGCCCAAACT 45360 

TCTCTAAGCC TAGAAATAGC 45420 

ACCAATTCTT TATATCCTAC 45480 

AAATTTCTTA TATCAAGATC 45540 

AAAAACTTTC CATTATAAGA 45600 

TCCAAATAAA AAAATCCCGT 45660 

AATAAAATTA TTTCCAGCGC 45720 

TATAAAATTA ATAGAGTCTT 45780 
AATATTTAAA AACAATTTAA 45840

ATATATATTG TGCAAAATTA 45900 

AAATTCTTTT GTTCTTGTAA 45960 

ATAAAAATTT TTAATACCAA 46020 

GTCCATGTAA ATCACTCCAT 46080 

ATAATTTTTT GCAATTCTCT 46140 

TTCTAAGAAA AGATTTAAAG 46200 

AACAGTCTTA AGCTCAACAT 46260 

AACTGCAATA TTACTTATTT 46320 

TATATCTTTT CTGTTGATTT 46380 

TATCAATATT CCATTTTTCT 46440 

CTCATAATTT TTAAAAAAAT 46500 

AGGGCTAAAA TTCAAAAAAT 46560 

AATACCTGCT ATTGCAGAAT 46620 

GGGATTTAAT CCTAATTTAT 46680 

ATGAGAACCT GCAATTCCAA 46740 

ATAAGATAAA ATAGGTATAT 46800 

ATTATAGGCT GAAGAATATA 46860

CTCATAAATA TTATCATAAT 46920 

TTTATCAGAA ACATCTACCC 46980
CTTCTAGACA ATACCCTTTT GAATTTAAAA AACAAGCCAG AGAACAAGCC CCACTTCCCT 47040

TTATTCCTAC AAAAAAAATA TTATTCAAAT CGTCAAAATC AACCTTCATA GCTCTCTTCT 47100

AAATAATCTT TTTTGTCATA AAATTTATGT ATAACTATAT CGATAGATAA TATATTAATT 47160 

ATTGTATACT TAAGTACGTT CGTAATAATA TAGGTTCTAA GCATTTCTGT ATTATTAAGC 47220 

AAGTTATTTC CAAGAGGAAT ATTTAAAAAC AGTTTAAAAA GGTAATTGTT TCCATCTTTT 47280 

TTGATTTTAA GATCATAAAC AATATAATTT CTATCATACT CAGAAACACA ATGTTCAATA 47340

ATTTGCCTTA CAGCTCTCTT AGAAATAGAT AAAACCCCTT CTTCATAAAA ATGGGGCCTT 47400 

ACAACAGATC GAATATAATT TTTCTTCCTA GCAAAGAACC ATCCGCTTTT AAAAAAAACT 47460 

TTAATTGAGT TTAATAGCAA ATTGGGCCTA ATAGACGTTA TTTCAAAAGC AGCTGCTGGT 47520 

ACAACATGCT CCCCCATTTG CCTTGAAATT CTTGCTTTTT CTATTTCCTG CCTGGTTGAA 47580

ACATCTGTTA TGTAAATAAT CTTAAAAACA TTTGGCAACA AAAGTTTTGA AATTATTTTA 47640

TCTATCATTT TAAGACTTGT TCCTAATATT AATATTTTAT TAAATTCTTC TTTTGCAAGC 47700

ACTTCAAGCA TTTCTCTGCA ATGAGCATCA TCTTCAAATA CAGATCGCCT TACCGCTTTA 47760

AAAACATTAT CTTCAAACTT AGCAGAACTT CCGGCAATAA TTTTCATATT TTTAATTAAA 47820

ACACCATCAT CAATTATTAA AGGTATTGAA TATTTATCTG CTACCAAATG CGATCTAAAG 47880

CTCTTGCCAG TTCCAGCAGA TCCTACTAAT GCATACACCT TAAGCCTAAC AAATTTATGC 47940 

TTAATGCTAA TTGCAAAGTC TTTAACCTTG CTTAATATTT TTTTTAAAAA AAAATCACTT 48000 

GAAAAATTCT GAAAATTCAT AGACATCAAA AATATCTTTT ACTTGCTTTA AAAATCTAAA 48060

CTAGGCTTAG AGCAAAATTC ATTCTTTTTA AAGAATGTTC CATTAAATTT TACCAAAAAA 48120 

GCATGTAAAA AGTAATCGCT TTTCTTGAAT TTATTACAAT ATTTCTTGTC ATTAATTAAA 48180 

GGATGATTGT TAAAGGAACA CTGAGACCTT ATTTGATGGG TAAAGCCTGT TTCAATAACA 48240 

ATCTCGACAA GAGTAGCTCT TTTGCAAGAT AATATTGGAT TAACCTTTGT AATTGCATTA 48300

ACAAAATTTT TATCTTCTAA AACAAAAGTT TTTCTCAACC TTTTATTTCT AAATAAATGA 48360 

TTTTTATAAA CAACAGGAGA CTTAACCTCG CCTAAAAGTA TTGCAAAATA TTTTTTAATT 48420 

ATAGATCCAC CACTAAATGC CTCACTTAGC TTTCTTGCAG TATTTATATT TTTTGCAAAA 48480 

ATAATAATAC CAGAAGTATT TCTGTCAAGC CTGTGAACTG CCGAAGGCTT AAAGCTTAGG 48540 

GATCTTAAAT TTTGACTTAA AAGATAAGAA TTCACTAAAA AATCAAGAGA ATTTTTACCT 48600

CCATGAACTA AAATACCTTT TTGCTTATCT AAAACAAGTA AGTCACTGTC TTCATAAATT 48660 

ATTCTTTTTC GAATATATTG AAAATCAATA TTGCTTTTAA AGCATTTATC CGTGGTTAAG 48720

TTCAAATTTT GGGCTAAAGA TTTGTACAAA TAAATTTTAT CACCTTTGCA AACTCTGCAT 48780

GAAAAATGTG ATTTTAAACC ATTTAGCCTA ATGTCACCTT TTCTAATATG TTTTATTATA 48840 

CTCGCTTTAG AAAAATTTAA AATTTTAATT AAAATTGAAT CTAGTCGCTT GCCATTATCA 48900 

TTAGCAAGCA CTTCTAAAAA AATATATTTA TCCAAACGCA AAAAATACCC CTAACAAACC 48960 

TTACTATTTT TTTTACAAAA AAAATTAACT ACTAAAAATG TAAATATAGA AACAAAAAAT 49020

GATGGAAAAA CGGGGTGAAA AAACCAAATA TTTAAACCAA AGAATAAAAT GGACAAATAA 49080 

AATATTAACC CTAAAAACAT AGAApCAAAA GCCGCTATTT TGCTTACAAA ATTTAAATAA 49140 

AGTCCAAAAA CAATAATAGG GAAAAACGAA ACTTCCAAAG CTCCAAAGGC AAAAATATTA 49200 

ATAAAGAATA AAAAATTGGG AGGAAAGAGA GAAAATATAA GTATTATTAA AATAAAAAAA 49260 

ATATTAGAAA TCATTATTAT TCTGCCAATC TTTACATCTT GTTTTAAATC TTCTTTATAA 49320 

ATAAATATTG ACTTTATTAA AACAGATGTT ATTAATAGCA AATTTGAATC CACTGTAGAC 49380

ATTATTGCAG ATAAAAGACC TATAAAAAAC ATAAAACAAG AAAAAGGATT TAAAACTTTT 49440

AAAGCCACAT TTAAAACAAC TTTATCATTT GGACTTAAAT CTGGAAAAAG AATAATAGCA 49500 

AAAAACCCTA TTAAATGCAT CAAAACAATT AAAAAGCTAA TAATAAAAGT AGAAATGGGA 49560 

AGAGAAAATT TTATAGCATT CTCATCTTTA AATGCTATAA AATTATTAAT AATCTGAGGC 49620 
TGCCCTAGTA TTCCTATTCC TATTAATATC CAAAAAGAAA TTATATATTG TGGCTTTAAG 49680

TCAGCATTTG AAGGAAGTAA AAGGCTTTTA TCTAAGCTAG ACGTTGCTGT TTTGAATAAA 49740 

TTATTAATAC CCCCTCCCAA ATCTAGCATC TTGGAAAACA AAATAACGGA TGAAACTAGC 49800 

ATTAAAAATC CTTGAATCAA ATCCGTATAA GCTACTGCCT TAAAGCCGCC AAAAAATACA 49860 

TAAATAAAAA CCAAGAAGGC AAAAAAAGTA AGACCAACTA CGTAATCAAT ACCCCAAAAA 49920

ACTTCTATAA GTTTGGCACC ACCTATTAAT TGGGCAGAAA TCAAAAACAT TGAAAAAAAA 49980

ATCAATACAA ATCCACTCAT TAACGCCAAA AAATCACTTT CATATCTATG CCTAATATAA 50040 

TCAATAATAT TAATTGCATT AATTTTTTTT GATTCGCGAT TTAATCTCTG ACCAACAATA 50100 

ATAAAAACAA TTAAAGTTGT AGGAATTTGT ATGGTAGCTA ATAATATAAA AGATAATCCA 50160 

TACTTATAAA CAGCAGAGGG ACCGGAAATA AAACTACTAG CACTAATATA GCTAGAAGAA 50220 

AATAACAAAG CCATAACAAT AAAATTAATA TTTCGATTTG CAAGAAAATA TTTATTTAAT 50280 

AACAAAAACC TACCTCTATT TCTTTTTTTA AGAAAATCTA AAAATAAAAA AATATCAATA 50340 

ATGTATCAAG TTTTACTAAT TACTAAAATA AAAAAACAAA CCAAAAAAAA ATTATACTGG 50400 

GGAATAAAAT TCCTGACAAA AAAAACCACA AAGGAATATT AAATATAGTA GTTGATGTGT 50460

CAATAAAATA GGCAAAACAA AACCACAATA CAAACATAAA AACATACAAT AATATAGCGT 50520

ACAAAATCCC ATTTCTCATT TAAAACAACC TTAACAAAAG GACATCAAAA TTTATAAATT 50580

CAAATATAAT GTATATTATA TAATATATAT TATATGGATA AAAATAAACA TATATTAATT 50640 

GGTATATGTG GGGGCATAGC CTCTTACAAG TCAGTTTACA TAGTTTCCAG TTTAGTTAAA 50700 

TTAGGATACA AAGTTAAAGT TATAATGACA CAAAATGCAA CTAAATTTAT TACTCCATTA 50760 

ACTTTAGAAA CCATTTCTAA GAACAAAATA ATTACTAATT TATGGGATTT AGACCACAAT 50820 

GAGGTGGAGC ATATAAAAAT TGCAAAATGG GCACACCTAA TTCTTGTTAT TCGTGCTACC 50880 

TACAACACAA TATCTAAAAT TGCATCAGGA ATTGCTGATG ATGCATTAAC TACAATAATA 50940 

TCTGCAAGCA CGGCTCCTAC TTATTTTGCA ATAGCAATGA ATAATATAAT GTATTCAAAC 51000 

CCTATTTTAA AAGAAAATAT AAAAAAGCTT AAAACTTATA ATTATAAATT CATTGAACCT 51060 

GATAAAGGAT TTTTAGCTTG CTCATCAAAT GCTTTAGGGC GCCTTAAAAA TGAAGACAAA 51120 

ATTATAAAAA TAATATTGAA TGAATTTAAT CAAAAAGACT AGCTAAAAAA TAAAAAAATA 51180 

CTTATAACAG CATCCAGAAC TGAAGAATTA ATAGATCCAA TTCGCTATTT CTCAAATACA 51240 

TCAACGGGAA AAATGGGGTT TTGCTTAGCA CAAGAGGCTG TCAAACTAGG AGCTCAAGTT 51300 

ACAATTATTA CAGGACCAAC CAATGAAAAT GATCCTGAAG GGGTCAACAT TATAAAAATA 51360 

AAAACTGCAA TGGAAATGTA CAAGGAAGCT CTCAAAATAT ATAATAAATT TGAAATAATA 51420

ATTGGAGCCG CAGCTGTTGC CGATTTTAAA CCCAAACACA TTTTCAATAG TAAAATTAAA 51480 

AAAAATAAAA TCAATAGATT ATATATAAAA TTAGTAAAAA ATCCCGACAT AATCCAACAC 51540 

ATAGGACAGA ^TAAGCTTAA AAACCAAATT GTTATTGGAT TTTGCGCTGA GAATTCTAAA 51600

AATTTAATTC AAAAAGCTAA AGAAAAATTA AAAAAGAAAA ACTTGGACTT TATCATTGCA 51660

AATGAACTTA AATATTTTGG TTCAAAATTA AACAAAGTTT ATATAATAAA TAAACAAAGC 51720

ATAAAAGAAC TGCCAGAAAT GGAAAAATCA GAAGTAGCTA AAGAAATTTT AAAAATTTTA 51780

TACTAATATG CTTAATAGTT TATTAATAAT CAATAATCTT TTAAAAGCAT TTTAATATAT 51840 

TCGGAGTTCG TTTCACTTTT AATTTTTTTA AGCTTATCAG AAAGCGATGA AGATTTATTA 51900 

TAAACAGCTC CTTTTAATGC TAAATGCCTT AAATCAATAT TTTTGCTGTC AATGATTTTA 51960 

GAATAAAAAT TATCAAAATT ACCCTTTTTA CCAGCCAACA TTGAAGCAAC GCCCCTTAAA 52020 

ACATTTGAGG GTCTATTAAT ATTTTCTTTA TTAACAATTT CTAAAGCAAT TGACAATGCT 52080 

TTTAGAGAAT CCTTATCTAA AAGGTAACTA AACATTGAAA TTTTAAAATT ATTGTCAATC 52140 

TTAAAATCAA ACATAATGTT TTTTATCTCA ATATTCCCAA GATCCATATC AATTAAGGCC 52200 

TTAGCAGAAG CCTCCCTAAC TTTAAGAGAT GGATCGCTTT TAAGCTTATA AATCAAAATA 52260
AAGAGTCCCT XTGTCCTTTG ATTGCATTAA
CTCTTAAAAA TCCTTGTAAA ATCTCTTTAG
CAATAATAGC TAATTTAACA TTTAAATTAT 
TTTCAGTTAC TTTATCTGAA GCAAGATATG 
ATGGGCCCTC GTAATTATCT AGCGAAATTT 
ACATTTTTCC AAGAGCAATA AGTATTTCTC 
CAAAAACTTC CATCATGTTT TTAGAATACT 
CTGCAATAGA TACCACATTG CCCTCTTTAT 
ATTTTTCTTT ATCATCAAAC TCCTTAAGAT 
AATATCTTTT ACTCTCATAA TTTTCAAGAA 
ACTTAAGAGA AATAAACAAT TCAAGTATTT
CAAGTCTTTT TTTAAGAGAA AAATTATATT 
TAATGCTTGT CACTTGACTA TCAAGCCCAT 
CTAAACCAAC ATTAGAAAAA TTCTCTCCCT 
TTTCTGTAAT TTCGGGCAAC AAAGGCGGAC 
CATACACATT AAAAATAAGT AAAAAAAATA 
TCTAATATAT CTAAAAAGCT TATTTATTCC 
AATACTAACA ATTCCAGCTG CCATTAAAAA 
CCACTGAAAC TTTTCAAAAA AGAAATAAAT 
CTTTAAAAGA ACAAATAAAA TTTCAATTAA 
AAAATAAAAA ACAATTACAC AAATCATAAA 
CAAACCATAA TAATTAATAC CAAAAACAGA 
ACTCAAATAA AACGGTGTTT TTGCATCACG 
AAACATTGAA TAAAAAAGCA GACCTAAAAG 
AGTATCATAA ATAGAAAACT TGCCTCCCAT 
AATAAACATT AAAAAAGACA CTGGAATAAA 
TAAAAGGGCA TTTAATTTTA TATTATTCCC 
AATCACTGTT GCAATAGAAA TATAAAAAAT 
ATTACTAAGG ATAGAAACAC TTCCTATCTC 
CTGAGTAATA ATTGAAATGG GAAAATCCAA 
TAGCTTTAAA^CTAATATTA 52320

ACTTTAAAGA AGGATCTTTG 52380

TGTTATTACT CTGAAGATAC 52440

ACAACGCTTC GATTGCAGCA 52500

CATAAATTCT ATCCTGATAA 52560

TTCTAGCCCC ATCATTTCCA 52620

CAAGAGAATT AAGCTCTCCT 52680

TTTCAAGAAT GTCAATAAGA 52740

ACGAAATTGC CAAGCCAAAT 52800

TATAATTTGC TGTATCAATG 52860

CCCTTTTAAG CTCAGCATTA 52920

GACTATCGCT TGATTTTTTA 52980

AAAGAATTGT ATCGTTAACA 53040 

TAGAAGAATT TTCTCTCTCA 53100 

TAGGAAGAGC TGGAGAATTA 53160 

AAAAATAAAA GTATTTCATA 53220 

TAAAACAGAA TAACAAATAA 53280

ATAAAGATTT TTAAAACTAA 53340 

TGCATATAAA GGAAAAAGTG 53400 

ATCAATTTTA ACTCCTCTTT 53460 

AGAAATAGAT TGAGCTAATG 53520 

TATTGCAATA TCAAGAATAG 53580 

AATAGAAAAA TAATATTTTT 53640 

AAAACATTTC AAAACACTCG 53700 

AAGAAATAAA TTTAAAATAT 53760

AATTAACAAT AAAATTTTAA 53820 

CAAAACAGCA TGCTCTGCCA 53880

TCCTACAGGA AGCTGATAAT 53940

AAGAGTAGAT GCTAATGCAA 54000 

GAATCATACG AAGCCATCTG 54060 
GTTAAAAATT TAAAACCTTT TCTCTGAAAT AAAATGTTGG CTTCCAGGCA AAACCAATCA 54120

TAAGGCAATT TGCAAACGGA ATTAAAAATT GTAAAAACCC CCCAAAAATT ACGCCAATAA 54180 

CAGCACTATA TATTCCAAAA CGACCATAAA ATAAGAATAT GCTCAATATT ATTCCAAAAG 54240 

AAAGCATAAT GGGCGAAAAC GAAGGAATGA AAAAAATTTT ATATGAATTT AGAACAGACA 54300 

CGAAGATTGA TGATAGGCTT ATTAGTAAAA TATATAATAC CAAATAACCA AATACAGAAC 54360 

TTGCAAAAAT TAAGTTTTCT CCCCTATAAT AAGATATAAA ATACATAATA GGCTTTGCAA 54420 

AAATAATCAT AACTAAAACA ATTAACCCAA TAGAAATAAT GTTAAAGGTT ATGACAGTTC 54480 

TGAAAAAAGA AACAGCTTTT TCGTGCGATT TGTTTTTTTC ATGTGTAAAT TCAGGCAAAA 54540 

AAGCCGAGGT CATCGCGCCC TCTGAAAGAA TTTTGCGCAA ATTATTAGGA ATATTGAAAA 54600 

CATAGTTAAA AATATCAGCA TCAAGATTTG CACCAAAATA ATAAGAGAAA ATCTTTATCT 54660

TTACAAAGCC CATTATTCTT GAAAAAAAAG TGGAAATCAT ^pACCAAAATT GTAGAAACAA 54720 

CATATTTATT CATCGAAATT TTCCTCTTCA TACTTTTTTA AATATGAAGT TCTAAAATTT 54780

AAAAAATCAT CATTTAGAAT TGCGGCTCTG ATCTTTGAAA TCAATCGAAA CATATAGTGG 54840 

ATATTATGTT CACTTGCCAA AACTATTCCA AAAAGCTCTT TCGATTTTAT TAAATGTCTT 54900 

AAATATCCTC TTGAATACCT TTTACATAAA GTACAGATGC AATTTTTCTC TACCTTAGAA 54960 

GTATCATCCT TATACTCCTT TCTACCAATG CACATAATCC CATTATCTGT CAAAAGAGAC 55020 

CCATGCCTAG TAATTCTTGC GGGATTAAAG CAATCAAAAA TATCAATGCC ATAATATATG 55080 

GCATTAAGTA TGTAATGGGG AGTGCCAATA CCCATTACAT ACCTTGGTTT TTCTTTTGGT 55140 

ATCAACAAAA AACTATATTC AAGGATTTCT AAATATTTCT CCCTTGGTTC TCCAACAGAA 55200 

ATGCCTCCAA TGGCAATACC TGGGCTGTCT AATTCCAATA TATCATTGAT ACTTCTTTTC 55260 

CTTAAATCTT TAAAAAAATT TCCTTGAGTT ATTAAAAATA AAAGCCCGTT GTATCCCTCT 55320 

TTTCTGTTTT TAGAAGATTT GAACGTGCTG CTAGCCCAAT TGGTTGTAAT ATTTGTATAT 55380 

AAATTGGCTT CATTATAATC AATCCCATAA GAACTGCAAA TGTCAAGTGG CATAATAATA 55440 

TCACTGCCAA AAATTTCTTG CATAGCAAAT ATTCCCTCGG AAGTAAAATA ATGGTACGAT 55500 

CCATCTATAT GAGATTTAAA ATGCACACCT TTTAGATCAA TTTTTCTCAG ATCAGAAAAA 55560 

GAAAACACCC GAAATCCGCC CGAATCGGTT AAAAAATTTT TATTCCAAAT TGTAAAATTA 55620 

TGAAGACCAA CATATTTTTC AACAGTTTTA ATTCCCAGCC TTAAATATAA ATGATAAGTA 55680 

TTTGCAAGCA TCAAATTACA TTCTAACTTC TCAAGAACAG CATGTTTTAA CCCTTTCATT 55740 

GCCCCCAAAG TACCAACTGG CATAAAACAA GGAATATCTA CTCTACCATG AGGAAGATTT 55800 
AAAAATCCAA CCCTTGCATT AAAATGCTTA TCATTCTTGA TTACACTAAA CATATAAATA 55860

TCCCAAATAA TTATATATAT TATTTCCTAA CAATCCTATA AACAAAAAAA TTGATAATCA 55920

AAAAAAATAT AGTTGTAAAA GAACTTGCTA TATATATATG CAAATAGCCT AGATCCGCTA 55980

AAAAATTAAA AATTACAATA GAAATAACAT AAACAACAGC AAAAGCAATG CTATTTAATA 56040 

GGCTGAGTAT AAAAATATTT TTTTTAAGAG CAAGAGCAAT AAACCCAACA GTAAAGCTAA 56100 

GAAGAATCAA TCTAAATGAA AAGAATATCC TATTTAATAA ATCAAAAAAT GCATCAGAAT 56160 

AATTTAAACG TTCAGCTTTG AGAAAACTTA TCCAATTAAT AAGTTTAGTA AAATTTAACG 56220

CCTTTGATGA GAGCATCACA GTTCTTATGT AATCGGGCGC CAGCTTAATA ATTCCTGTCC 56280 

CATCAAGAAC ATCGTAGGCG TTCTCCTTAA TTTTTTTACC AACCTTAACA AACTCTCTAA 56340 

TACCATAAAG CCTCCATTTA TTATCTTTCC ATTCGGCTTT ATTTATATCG TACCTTGTTT 56400 

GAAACTCATC TTTATTGTCT TTAATTATAA TCATCAAGTT AGCAAAAGTA TTCTCATCAA 56460

TATCATAAGA TTTGATATTA TAAATTTCTC TAGCAAAATC CCTTATTATT ATAGTTTTAT 56520

CCCCAGATCT ACTGTCGCCA ATGCTATTCT TAATAAGAAC ATCTCTTCTT GCTATAGTAT 56580 

CTATTACCAA ATAATTATCA AAAAAGAAAA GAACAACTGA AATAAATATA CTAATTAAAA 56640 

TAATTGGTTT TAATATCCTG GTAAGTGGAA CTCCACAACT AAAAAGACCT ATTATTTCAT 56700 

TTCTCATAGA AAGATTGCCA ATAAGATTCG AAATAGCAAA AAGAAAAGAT AAAGCCACCC 56760 

CATCTGAGAA TGCCTTTGGC AAATATAAAT AATAAATATA AAGAATATCC TTAAGGCCAA 56820 

TATTCTTTTC AAGATAGTTA AGAAGATTAA CAAACAAATC ACCAAGCATA ATTAAAATCA 56880 

TGAAAAGCAG GTTCATGGAC AAAAAAGTAA GAATGATGCT TTTTATAAAA AGCTTATCTA 56940

TTTTCATTTT TTTAACAATC TCAAAAAGAG AATTGCTCCT GCAATAATTA AAATTAAATT 57000 

AGGCAAAATA GTAACAATAA TAGGACTTGG TGCATACTGC ACAGTATAAA CTTTTCCACC 57060 

AATAAACATT ACCCAATAAA AAACACAAAC AATAATTGAA ATTACAAGTT CAAGAATAAT 57120 

GGAATATTTT CTATTAGAAT ACATTCCCAT TGAAAAAGCT AAAAAAATAA AAAATAAAAC 57180 

TGAAAGTGGT AAACTAATTT TTTGATAAAA TTCAAGATTA AACAGAGCCA AATTTTGCTT 57240 

CATGCTTCTA TCTTGATAAG GTTTGAAATT TAAATTTAAA TTATACATAT AGTTTAAATT 57300 

TTCAAAAACA TAAGATTCAT CCACATAATA GTTTTGATTG TATAAATAAT TTAAATAAAG 57360 

ATTTGAAAAA TTTAAACTTA AAAAGTCTCC TTCTAGATTA TTTTTTATAT TTGAATCTGC 57420 

AATTAAATTA TTTTGCTTTT TAATTAATTT TATAACATCT CTCATGCTCA TTTGTGAAGG 57480 

AGTTACATAA TTTAATAAAA AACTATCACT AAATGTAACC TGATCGATTG AATATTTCAT 57540 

CTTATCTGCA TAAAAATAAT CATAAAATCC ACTCTCACTG TCTGTTAAGG CAATAGATAG 57600

AACATCATTT AAAATAAAAT ACACTTGAAA ATTTTCTTTT CTAATATCAA GATTTTTTGC 57660

CATAAATATT CTATCAAAAC CCTTAAGCCC AGTGTTATCA AAAAAAGTTA CATTTTTATA 57720 

ACCATTTTCC GATTTCTCAC CAGAAACAAA AATCAAATCT CCATATTGTT TGCTTGAATA 57780 

AGGCTTTAAT ACCAAATGGG GAACTTCTTC TTTTATTTCA TTAAAAATTT TTAATCTGCC 57840

AATAGATCCA AGTGGAAGTA AAATATCATT GGATATAAAA GATACAAAAG CAATAACTAT 57900

TCCCAATTTA AAAAATGGGA CAAGTAAATC AAAAATTGAT ATGCCAATTG AACGAAAAGC 57960 

TAAAATTTGA TTGTGAAGCT TGAATTTATG AATAGTAAGA ATTACTGAAA TCAAAGAAGC 58020 

AAAAGGGGGA GAAAGCGCAA TAACCATAGG AAGAGAATAT ATAATAAAAA TAAAAGCCTT 58080 

AAAAAAGGGA ACATAATTTT GAAGAAGTAT TCTCATAAAG AATAAAATTT GATTTATAAA 58140 

AAATACGAAA AAGAAAAATA AAAACGTAAT TAAAAAATAT TTAAAAAATT CAGCAATTAT 58200 

GTAAGACTCA TAACTGTTTT TTAATATTTT CATCTACAAA AACCAAACCG TTATTAATAG 58260

TGCCCAGTAT GACATAATTT TCAAATCTAG AAACTTTTAT TAAATTCAAA TTAAGAAGCC 58320

CATTATTGGG TCCAAAATAA TCCCAGTTGT CATTTTCAGA ATCATAAATC AATAACCCAT 58380

GATCAAAGGT TGCAAACAAT AGCTTTTTAT CTTTAATCTC CATATCCATA AAATAATTAA 58440

CATCAATATT ATTGGCAATA ACGTGCTTTT TGTAACTATT TTTATTTAAA TTTAATTCAA 58500

AAAGACCCCC ACCATATGTT CCAACAAAAT AACTATCTTT ATATTCTTTT ATAAAATTAA 58560

TATTTTTTTC ATTATCATTT TTGCTAAAAA AATCCAAATG TTCAATCTTT TTCAAATTAT 58620

GGACATTAAC ACTATAAATA GCCTTGTCAA QTGTTCCAAC TAATAATAAA TTTTTTAAAC 58680

TATCAAAGCA GAGTGAAGAA ATTTTATTAG ATCCAAGCGG TATATTTTTC CAATTTTTTA 58740

AATCATAAAA CCATAATCCA GAATTTAGAG TGCCAACAAA TATTCCATTT TTAACAGCAA 58800

GCAAAACTTG TACATTGCTA AAATCAGCAT TACCGGGAAC ATTTATTTGC TTTAAATCCC 58860 

CATCAACATC ATCTATATAA TAAACAACAT TTTTACCACC AATATAAATT GTTCCATTAT 58920 

AATCCGCAAA ACCCCTAATG CCATTTAAAA AAATGCTTTT TTTATCCTTA AGATAGACTC 58980

TACAATCATT TTTTTTAATA TTATATCTTA AAAGCCCTCC CAATATATTA GTTACAAATA 59040 

TATTGTCATT AAAGACAAAT GTATCAAAAA CGCTGTTGTC AAGAAATCCT AAAGACTCTA 59100 

AGTTTAAATG ACTTACTCCA AGTTTATTAC TTAAAAAGCC ATAAATCTCT CTATCATAAC 59160 

CTGAGATAGA AAATTCATTA ATCTCTTTAA ATGCAGAAAT TGCATCAGAT TTTTCTTTAA 59220 

CAAGATATTT TAATTCAGCT AATCTTAAAC TAGCATGAGA ATATTTATAG TCTTTTAAAA 59280 

ATAGATCAAA ATTATATTCG GAAAGATCAT AAAAACCATT CTCATAATTT ACATATCCAA 59340 
AAAACAAATT CGCTTTAGCT AGCAATTCTC TGCTGTGCTG ATTCTCATTA GTTACTATTT 59400

TATTCAAATA ATAATTTGTC AAGCCAATAT TTTTTTTAGC ATAACTTTCC CTGGCTTTTT 59460 

TAAAATAAAA ATTATTTTCT TTTAAAAAAG CATCACTAAG AAAAAAAGAT TCATTATCTT 59520 

TTAATCCTAT CAAATAGTCT CTTTTAAATT TACCCTCATT ACTCCCAAGA ACAACATTAT 59580

TATCATCAAT AATGACTGCT TCTTTTTTCT TCTCTACAAT ATCACTAATA TGAGAATCTT 59640

GAATAGATCT ATCTGTAGTA AGGCAGGAAA AAAACAAGAA AAAACATATA AGACAACCTT 59700 

TAAATAAATT ATTCAAAACA AATTTCATAT TATTTTAATA ATCTTTATCT CTAACAATTT 59760 

CAAGATCAAT TTTTCCAAAC TTGTCTATAT CAATTATTTT GACCTTAATT CGCTGACCTT 59820 

CTTCTAATTT TGGGGGTCGA ACTAACCCCG CATTACCTCT TATATTCTCT CCACCGCCAC 59880 

CAAATCTAGA ATATCTATTA CTATTCCCAA ATCTTCCAGA ACCATACTTA CTGTCTCTGG 59940 

GTTTCAAACG AGTACTTAAA AATCCTTCCT TTGCAGGAGT AAGTTCAATA AAAGCCCCAA 60000 

AGCTATTAAT CTTTTTGACA GTTCCTTCAT AAATTTCGCC TACCTTTGGC TCTCTTACAA 60060 

TACTCTCTAT TCTTTCTTTA GCTTTTTGCA TCTTAAAATC ATCATCCCCG AAAAGAATGA 60120 

TTTTTCCATT CTGCTCAATT TGAACCTTAA CTTCAAATTC ATCTGTTATA GCCTTAACAG 60180 

TTTTTCCAGT AGATCCTATC ACAAGAGATA TCTTGTCAAT GTCAATTTGA AGTTGAACAA 60240 

TTTTAGGAGC ATACTTAGAT ATACCAACTC TTGAATTAGA AATTACAGTA TTCATAATAG 60300 

ATAATATATG TATTCTACCT ATTCTTGCTT GCTCAAGAGC ATCTCTCATT AAATCTTTAG 60360

TAACATTTTC AATCTTAATA TCCATTTGAA ATCCAGTAAT TCCATTTTTT GTACCGGCCA 60420

CTTTAAAGTC CATATCACCT AGATGATCTT CTTCTCCAAG AATATCACTT AAAACTAGAT 60480

ATTTATCCCC TTCGCTAATA AGCCCCATGG CTATCCCCGC AACCTGCCCT TTAACAGGAA 60540 

CCCCTGCTGA CATTAAAGAC ATGCTCCCAG CACAAACAGT AGCCATTGAA GAAGATCCGT 60600 

TAGACTCTAA AACCTCAGAA ACTACCCTAA TGGTATAAGG AAAATCATTT TTTCCAGGAA 60660 

CCATTGATTC TAAAGCTCTT TGAGCTAAAT GACCATGGCC AATCTCGCGC CTGCCAGTCA 60720 

TTAGTCTACC GGTCTCACCA ACTGAAAATG GGGGAAAATT GTAGTGGAGC ATAAAATTAA 60780 

GGCGTTTATC GCCATCAATA TCATCCATTA TTTGTTCATC AATGCTTGTA CCAAGAGTAG 60840

TTACCGCTAA AGCTTGCGTC TCTCCCCTTG TAAAAAGCGC AGATCCATGC GTTCTACTTA 60900 

AAATATCAAC TTCTGAGATA ATATCTCTTA TCTCATTAGG AGTTCTGCCA TCTGTTCTAA 60960 

TATTATCGTT AAGAATAGAG CTTCTAACAA TCTCCTTCTC AAAATCATCA AAAGCCTTAT 61020 

GAAAAAGAGA TTCATTGCTA ATCAGTCAAT IW>rCTCAAGAGA AGAAAAGTAC TCATAAGATT 61080

TATTTCGCAG CAAAGTTATG GCTTTATCTC TATTAAGCTT TCCCTTAACA AAACAAGCTT 61140 
CTTTAAGATC AGCATAAACA AAATCCCTAA GCTCATCTTT AAATTCAAAT ATTTTTTCTT 61200

CAAAAGCTAA AGGAAGTTTT TCCTTCTTGC CTACAATATC TAAAAATTCT TTTTGAGCAT 61260

TACAAATTTG CTTAATATAT TCATGAGCAC CATCTATTGC TGAGAGCAAA ATATCCTCAC 61320

CAACCTCATT AGCACCACCT TCTACCATAG TAATTCCATT TAAACTTCCG GCAACAACAA 61380

TATCAAGATC AGAATCATGA ATCTCTTCAA ACGAAGGGTT TACTATAAAC TTACCATTCA 61440

AATAAACCAT TCTAACAGCT GCAATTGGAC CATTAAACGG AATATCTGAC AAAAAAACTG 61500

CCGTAAAAGC AGCATTCATT CCAACAATAT CAGGAGGATT AAGCTGATCT GTAGCTAAAG 61560

TTGTAGGAAT TACTTGAATT TCTCGACCAA ATCTTTTATC AAAAAGAGGT CTCATCGGCC 61620

TGTCTATTAG TCTGGAAACA AGTATTTCTT TATCCTTTGG CTTTCCTTCT CTTTTGATAA 61680

ATCCTCCCGG AATTTTACCG GCTGCATAAT ATTTCTCATT ATATTCAACA GAAAGCGGAA 61740

CAAAATCTAA ATCTTCTCTC ACGTTACTCG AGCAACAAAC AGTTGCAAGA ACCGAAGATC 61800

CACCATAAGT TGCAAGAACC GATCCATTAG CCTGTTTAGC CATAAATCCG GTCTCAAACA 61860

CTAACTCGTC TCTGCCTATT TTCAACTTTA ATATTTTCCT CAAAATTCAA CCTCTTTTTA 61920

TTTTCTAAGA CCAAGTTTAG ATATCAACAT CCTATAAGCT TCTAAATCTT TTTTCTGGTA 61980

ATACCGCAAT AAACTTCGCC TTTGCCCTAC TAACTTTAAC AAGCCTCTTT TTGAACTATG 62040

ATCTTTTTTA TTTATCTTTA AATGTTCAGT TAAATACTTT ATTCTACCTG TAATAAGTGC 62100

AATCTGAACC CCAACAGAAC CAGTATCACT TTCATTTTTT CCAAATTCAG AAACTATTTT 62160

TTGCTTTTGC TTTTTATCTA TCATAAAGCA ACTCCTA^AC CATTATAGCA AAGCTCTAAC 62220

AAACCTCTTG CCATAATTTA AAATAACTAC TACGATAGAT TATAATATTT TTTCTTAAAA 62280

ATAACAAAAG CAATTTATCC TTTTCGGGTT ATTTTTAATA ATAAATTATT AAATTGTTTA 62340

AAAAAACAAA TATAAAATTA AAATGTCATA ATATATTTAT AATTAAAATA TAATGACATA 62400

TTTATATTTA TTCAAACCCA CTCCTTGAAT TACTGCTAAT ATTTTCTCTT CTCTGGATTT 62460

TAAAATTTTA AATTCATTAA TATTGATTTC AATTTCAAAA TAAACACCAT TTTTAACAAG 62520

ATTTATCTTA TTAGAATCAA TGTAAACCTT TTCAAAACTT TTTAAAGATT CTAGACTAAT 62580

TAAGGAAGCT TTACTCAAAT TTTCACACAA TGTGGAATCT TTTAATCTAA ACATACCTAC 62640

           TTTAAATTAC TAACATACGC GCAAGAATTT AGAGAATATG CCAAATCTCT 62700

TGCAATACTC CTAATATAAG TACCTTTTGA ACAGCTAATT TTCAAACTAA GCAAAGAAGA 62760

ACTAAAATCA TAACTTAATC TTTGAATATT ATAAACAGTG ACTTTTCGTT TTTTAATTTC 62820

AAAAAACTTT CCATTCAAAG CAAGTTTATA GGCTCTGCTG CCATCAATAT GAACAGAAGA 62880
AAATCTAGGA GGACTTTGAT AAATCTCTCC TACAAAATCT TTAAGCTTTA AATCTATATC 62940

CTCTACATTA GGAATATAAT CTGTTTTACT AACTATTCTT CCATTCGGAT CAAGGGTATC 63000

TGTTTCTAAT CCAAATCTGA ATTCTGCTAC ATACTCTTTA TCTAAAGAAG TAAAATAACC 63060

TGAAAGCTTT GTGTATTTTC CCACAAGACA AACCAAAATT CCACTTGCAA ATTTATCAAG 63120

TGTGCCAGCA TGCCCAACAC GATTTGTATT AAAATATTTT TTTATAGGGA AAAGAGTTTC 63180

AAAAGAAGTT TTACCTTGTT CTTTATTAAT TAAAAGGAAT CCATTTTCCA AATTTAATTC 63240

TCTCTTGTAG TATTTAATCC TTCAATTAAC TTATTAACAT AAAATGATTT GGAAAGAGAA 63300

TCATCCTTAA CAAATAATAA TTTGGGAGTG CTTCTAACTT TAATTCGCTT AATAATTTGA 63360

CTTTGAATAA ATCCCTTAGC ATTATTTAAA GCTTTAACTG CATTGTCCAA AGAAGCACCT 63420

TCCTTAATAG AGCCCATAAA CACTTTAGCA TTTATTAAAT CTTTTGAAAA TTCTACTTTA 63480

ACCACGGTTA AAAATGAATG AATTCTGGGA TCTTTAATCC CCCCACTTAC TATTAAATTG 63540

GCGATTTCTT GAGCAATAAA ACTTTCAAGT TTAAACTTTT TAATATTCTT ATA&ATAAAC 63600

ACATATAAAT AAAACAATAC TAAGTTTTAA AAGATTTTTT AACCTTTTTT ACCTCAAATG 63660

CTTCAATTAT ATCTCCTTCT TTAATATTAG CATAATTATC AATCATAATA CCACACTCAT 63720

ATTGCTCAGC AACTTCTTTA ACATCATCTT TAAATCGCTT TAAAGATGAA ATTTTGCCGG 63780

AATGAATCTG TAAACCATCT CTCATTACAT TAGTAATCGC ATCTCGCTTT ATTAGCCCCC 63840

GAGAAACATA ACAACCGGCT ATTACCCCTA TTTTAGGAAC ATTTATTACA GCTCTCACTT 63900

CAGCAAAGCC AATAAACTGC TGCTCAACAT CTGGCTCAAG CATTCCTTCA AGAACTGACC 63960

TAACATCATT TATAGCATCA TAAATAACAT TGTACTTTCT AATCTCAACT TTTTCCTGAT 64020

CTGCTAGTAC CTGAGCTTTT GCAGTAGGCC TTACATGAAA TCCAATAACA ATAGCATCGC 64080

TTGCTGAAGC AAAGCTAATA TCTGTTTCGG TTATTACCCC TGCTGATGAA TGCACAACTC 64140

TTACTCGAAC CTCATCGTTT GTTAATTTTT CAAGAGAATT CTTTAAAGCT TCCACTGAGC 64200

CTTGAACATC TGCTTTTAAA ATTATTTTAA GCTCTTTAAG CGCTCCTTCT TTAATTGAAT 64260

CATAAAGATT CAACATAGTA ACTTTCTTTA CATTTTTGGA AGATTCATAT TTTTTAAGAT 64320

CTTGTCTTTT AGAACTGATC AATTTTGCTT CTTTTTCAGT TTTAGTTACT TGAAAAGGAT 64380

CCCCGGCTTG AGGCATTGAA GAAAATCCTA AAACACTAAT GGCTTTAGCG GGTCCAACGC 64440

TCTTAACAGA AACACCCTTT TCGCTAATTA ATGCCTTAAC TTTACCATAG CACGCTCCAC 64500

CCACAAAAGA ATCTCCCACA TAAAGCGTTC CATCCTCAAT AATAACAGAA CAAACTATTC 64560

CGCGCCCCAA ATCAATCTTG GCATCAAGCA CTTTTCCAAT AGCTCTTTTG GATGGATTTG 64620

CCTTTAACAA CATCATATCT GACTGTAAAA GAATCATATC AAGTAGTTCA GAAATTCCTA 64680
TATTTTTAAG AGCAGAAATC ATCACAAAAA TAGTATCTCC CCCCCAATCC TCAGATACTA 64740

AACCGTATTC TGAAAGCTGG TGTTTAATCT 'fATCGGGATT TGAATCTGGT AAATGAATCT 64800

TATTTATAGC AACAATAATT GGAACATTTG CCTCTTTTGC ATGATTGATA GCCTCAATGG 64860

TTTGGGGCAT AACACCATCA ATTGCTGACA CAACAAGAAC AACAATATCT GTAACTTGAG 64920

CCCCACGACT TCTCATCATA GTAAAAGCTT CATGACCAGG AGTATCTAAA AATGTTATTT 64980

CTCGATCATT ATAAACAATA GTATAAGCTC CAATATGCTG AGTAATACCA CCGGACTCTG 65040

TTTGATTTAT ATGTATATTT TGAAGCACAG AAAGTAGTTT GGTTTTGCCA TGATCAACAT 65100

GACCCATTAT TGTAATAACA GGAGGCTTTT CAACTCTTTT GCTTTGATCT TCCACTTCTT 65160

CTTCTATAAC CGTTTCATCA TAAATAGAGA CAACATTAAC TTTTGAACCA TATTCTTCAA 65220

CTAAAATAGT TGCAGTATCA GAATCTATCT TTTCATTAAT AGTAACCATT ACGCCCAAAG 65280

CCATTAATTT AGCAATCAAA TCAGAAGATT TTAAATTCAT CTTTCTTGCA AGATCAGAAA 65340

CAGTAATGCT ACCCATAATG TCAATTGACT TTGGAATAGG GTTGGCTAAA TTTTCTCTCT 65400

TCTTTTTCTG AAGTTGTTCA AAAACTTTTT GTTCAATTGT TTTGCTCTCA GTTTCTGCTT 65460

           ATAGCTTTTT TGACTCTCTT GTTGCTGTTT TTTTTTCTCG CCAAGCTTAC 65520

GATTTAACTC TTTACTATTC TCAGAATCCG CTGCAGGTGT GCTGCTAACA ATAGCGGGAA 65580

CTTTAGTTTT TATAAGTCTT CTAAAAGACA TAGAAGTAGT AGTATATTTA TTTTGAGAAT 65640

TATTTTTGGC AACATATGTT TTCTTTACTG AACCTTGATA TTGAAAGGAT AAGCTGTCTC 65700

TGTTTTGTGA ATATCCACCA GTTCTGTTAT CTCTGTTTTG TGAATATCCA CCAGTTCTAT 65760

TGTCTCTGTT TTGTGAATAT CCACCAGTTC TATTGTCCCT GTTTTGTGAA TATCCACCAG 65820

TTCTATTGTC CCTGCTTTGT GAATATCCAC CAGTTCTGTT GTCTCTGCCT TGTGAATATC 65880

CACCTCTATT ATCCCTATTT TGTGAATACC CACCAGTTCT GTTGTCTCTG TTTTGTGAAT 65940

ACCCACCAGT TCTGTTGTCT CTGTTTTGTG AATACCCACC AGTTCTGTTG TCTCTGTTTT 66000

GGGAATATCC ACCAGCTCTA TTGTCCCTAT TTTGTGAATA TCCACCAGCT CTATTGTCCC 66060

TATTTTGTGA ATACCCGCCA GTTCTATTGT CTCTACTTTG CGAATATTCA GCCTTATTGC 66120

TGTTATTATG CAAATCAACA AAGCTATTTG AATCATTTTT AACGCTTAAA TCATTATATG 66180

TTACAATTTT TACTACCTTC TTTTTCAACT TAATAATCTT AACTTTTTTG CCATCTTCAT 66240

TTTTAATATC ATCAATATTT TTCGACAAAC CTACTCCTCC TCAAACTCAA AACTAAGCCC 66300

TATTTTACAA CCCGGACAGG AGGTCATATT TTCATTAATA ACAACACCGC ATTCAGGACA 66360

AAGAAGCTCT TCATCTTCTT CTACCTTTTC CATAGACTCA TCATTGTCAT TAGCAATTAT 66420
TATCATCCCC TCTTTTAATA TTTTGTTAAT TTCTTCTTGT TTTTCATAAC ^TACACCAAG 66480

ATTAAAAAGC ACTCCCTCAT CTGCTTGTAA AAAATTGTTA ATATCATCAA ACCCCTCTTT 66540

TGATAAATTA GAAATCACAG AAGGATCAAG CAATTTAAGA TCACTTATTT TACTAATCTC 66600

TTCAAATTGC TCCTCCTCAA CAACATCTTG CATAACTTTA TCAAACATTT CGAGTGTTTC 66660

TTGCTTAAAC TCCGAATTAG CCTTCATTTC TGCAAATTGA CTGCTAGTTT TAACATCAAT 66720

AGCCCAGTCA AGAAGTCTAT TAGCAAGTCT AACATTTTGA CCCATTTTAC CTATAGCAAG 66780

AGAAAGCTGG TCATCACTAA CAACCACTAA AGCTTTATGT AAATCCTCGT CAAGAATATA 66840

AACATGTTCT ATCTTTGAAG GAGTCAAAGA ATCCTTTATA AATTCTTTAA TATCTTTACT 66900

ATAGGGAATA ATATCAATTT TTTCTCCTTC AAGTTCCTTA ATTATAGATT GAATTCTGAC 66960

TCCTTTTTGT CCTATACAAG GACCAACAGG ATCAATCTCT TCTTTTTCAG AATAAACAGC 67020

GACTTTGATT CTGTAACCAG GATCGCGAAC TATTTTATGA ATCTTAATAA TACCTTCTTC 67080

AATTTCTGGA ATTTCAAGCG CTAAAAGCTC TTCAATAAAC TTTGGATGGG TCCTAGAAAG 67140

AATAACTTCA ATACCATTTT TACCCTTTTT GACATTATAA ACTAAAACTC TAATCTTATC 67200

ATTAAGGTTA TAAACTTCTC TTGGCGATTG ATATTTCTTG GGAATTATAC CATCCGTATT 67260

ACCAAGATTA ACATAAAGAT CACCATTTCT ATTTTGTTGA ACGTACCCAA TAACAACCTT 67320

ATTCAACTTG CTTTTAAATT CTGATAAAAT CTCATTATCC TCAATTCCTT GCAGGTCATT 67380

TTTGGTTCTT TGTTTTGCAA CCTGAATAGA AAGCCTATCA AAAACTTTGG GATTAATTTC 67440

AATGTAAGCA TAATCACCTT CTACAATATT TTCCTTTGAG ATATCTTTTT CTAATATTTC 67500

AAGCAAAGAA TCTTTTACCT CTTTTACAAT TTTCTTTTTT GCATAAACAG ACAAATCTCC 67560

CGTATCATCA TCAAACTTAA TAAAAGCATT CTCATTGCTT CCAAAATACT TCTTATAAGC 67620

TATTAATACT GATTCTTTAA TTGTTTTTCT AATAGAATCT ATACTCATGC CACGATCATT 67680

TGCAATATTT ACAATCATAT GCCCCGTGCC CTTTATCATC CAAACTTCCT CCTTAAACTA 67740

ATCTAGCCTT TTTAACATCA CTATAAAAAA CATTTACTTC TTTGCTATCT GTTTTAAAAA 67800

TAAAACTTTT TGGCTTTGAC TCTAATATAA AACCGTCTTC AAATTCATTA TCCAACATCA 67860

ACTTAATCTT TTTACCTTCA AAAATTTTAA ACTCTCTGTC ACTTTTTATT TTTCTATGTA 67920

TTCCTGGAGT AGAAAGCTCT AAAGTAAAAC CATATTTAAG ATTTGCTTCT AAAATTAATA 67980

AAATCATTTT ATGCAAATCA GTCAAAAAAT CAATATCTAA GGAAAAATTT TTACTATAAA 68040

GAACTATTTG AATTTTTCCA TTATTTTTAT TTCTAAAGAT ATTAATTTCT AATATCTCAA 68100

CATTTAACCG CCCTGTTAAA TCTTTTATCA AATTAAAAAC TTCATTATTT TTGTCAAAAT 68160

ACTTAATCAA CTGTATTCCT AAAAATAATA AGGTTCTTTT AAAGAACCTT ATTAAATACA 68220
TAAACTAAAC CTTAAGTCAA GGTTAACACT TAGCAAAAAT AATGTCAACA TTAAACCAAA 68280

TTATAAGATT TGGCCAAAGA AAGCTTTATC ACCTTATAAG CTCCCAATTG CATAATAAGA 68340 

TCTTCACAAA TGCACATAGA TGCTCCTGTG GTAACAATAT CATCAAGTAA AACAATCTTT 68400

TTAAACTGAA AATTTTTATA TTTTGATCTT AATTTAATCT TATTTTCAAG ATTTTTAAAT 68460

CTAAGATTCC CTTTCATTAA CTTCTGGCTT TTTCCATACT TTCTTGAAAA AATATTTATA 68520 

TAATTAAAAC CAAAACGGCT TAACAAAATA CCAATGTATT CCATATGATC AAAACCATAA 68580 

AATAATTTTC TTTTAAAACT ACAAGGAACA GTTACTATTT GATCAAAATC AATATTATTT 68640 

AAACATTCAG CAATTCCACT TGCCAAAAAT CTACCAATTG ACTTTTGAGC ATCCCTTTTA 68700 

TAAGACAAAA TTAAAGATTT GTAATGCTCT TTATATTCAA AAAAATAAAT CAAATTCTCA 68760 

TCAAATTTAA TGTTAAAATT AAAAAGTGAC TTACATTGGT CACAAAGAGC ATTAGAAGAT 68820 

ACATACCTTT TTCCACAAAA GACACAAAAA GGCAAAAATA TACTCTTTAA AACATTTAAA 68880 

TAGCTCATAC TAACTGGACT GAGAAATAAC CTTTAAAACA ATTTGATTTA ACAGCTCAAT 68940 

TGATTGAAAA GGAGTAATAT TATTAATATC TATGTTAGAA ATAAAATTTT TTAACTCTAA 69000 

ATACTCATTT AATTTAATAT GAATATCAGT GTCATTTTTC AAGATCTCTT TATCATTACC 69060 

ATCAGAAGAA ACATGGGGAA GAAACTCTAA ACAAGAGTTG CCCTCTCGGC CCACCAAACT 69120 

TTCTAGAATA ACATTAGCTC TATCTATTAC CCTTAAGGGA AGTCCTGCTA TGCGAGCAAC 69180

ATAAATACCA TAAGAATTAA GAGATGGCTT TTCTTCAACT TCTCTTAAGA AAACAAGATC 69240 

GTTGCCCTGC TTTTCAATTT TCATTGAAAG ATTAATAAAA GCCTGATGAT TAATAGACGA 69300

CAATTCATGA AAATGTGTGG CAAACAAACT TCTAGCTTTA ATATACTCTA AAATATACTC 69360

TATAATAGAA TAAGCAATAG CAAGCCCATC ATTTGTGCTA GTACCTCTTC CAACTTCATC 69420

CATAATTATT AAACTCTTTT CTGTTGCATT CCTTAAAATG TTGGCTGTTT CATTCATTTC 69480

AACTAAAAAA GTGGATTCCC CTTTGGCAAT GTTATCACTT GCTCCAATCC TGCAAAAAAT 69540 

TTTATCTGTA ATACCTATTA AAGCTTTAGA AGCTGGCACA AAAGAGCCTA TATGCGCCAT 69600 

TAAAGTAATT AAAGCGACCT GACGCAAATA GGTTGATTTA CCTGCCATAT TAGGTCCAGT 69660 

AATTAAACAA AAATACTTTT CTTTATTAAT TCTTACAAAA TTTTCAGTAA AGATTTCAGT 69720 

ATTTTTAGTG TAGTGCTCAA CAACAGGATG CCGAGACTTT TCAAGAAGAA TTTCTTTACC 69780 

AGATGTCAAT ACAGGCCTTT TATATTCATT TTTTTTTGCC AAATAACCAA AGTTAACAAC 69840 

TAAATCAATA TATGCAAAAA ATTCTGCAAC CTTTTTAAGA ACTTTATTAT GCATAACAAC 69900 

ATTTGATGCT ATTTCATCAA AAATTTCCTG TTCAAAAGCA ACCACATTAT CTTCAGCATT 69960 
ATTAATATCC ACCTCAAGAG 7?AATAAGTTT TTCTGTTTTA TATCTTTTTG AAGAATTTAA 70020

AGCTTGGCTT TCCATAAAAT GTGGTGGCAC TTGAGCATAA TTACTCTTTG TAACTTCAAA 70080

AAATAACCCC CTATTATTAG TTTTTCTAAT CTTTAGGTTA TTAATCTTGC TAAGCAATCT 70140

CTCTGATTCA AGATATTGAT CAATATATTT ATTTGCATTA ATCTTTAAAT CTTTTAAGTT 70200

ATCAAGCTTT AAGTCATAAC CTCTTTTAAT AAGTTCATCA GGTGCACTTG AAATTGCACT 70260

ATTTATCAAA AAATAAACTT TAGAAATACT ATCCTCTTCA AATTTATCAA AATTCCAATA 70320

ATCAAAATTA TGCTTGTCAA ATAACTTTTT TACCGTAAAA AATACAGAAA GAGCTTTTTC 70380

AATAAATAAA AAATCTTTTT TAATATATCT TTTCATTTGA ATCCTAGATA TTATTCTCTC 70440

AATATCCCAT ATATTAATAA AAGTTTCTCT TAAAGTCACA GTCAAGCTAA TATTTTTGCA 70500

AAAAAATTCA ACATGATCTA GCCTGGTATT AATCTCAGAA ATATTTAAAA TTGGATTTAA 70560

AATAAATTCT CTTAAAAGTC TCTTTCCCAT TGCAGTTTTG CAATCATTTA ACACAGAATA 70620

TAATGAATAT TGAGAAGAAA AATCATTATT ATTTTTTACA AGTTCAAGAT TAACTTGAGT 70680

TACGTCATCA AGAAACATGT ACGAAGAATC ATTATTGATA TCTATTTTAT CAATATTACT 70740

TAATAAATTT TTTAAATTAT TTTTTATATG ATTTATAATA AGAAAAATTG AAATGTAATA 70800

GGGCTTTTCC TCATCAAATC CAAGAGAGCT CAATCCAAGT ATGTTAAAAT GCTCCTTTAT 70860

TGTTTTTATT GCAATATCCT TATCAAGATG CCAAGTAGGA ACTCTGTTAA TTAAAAATCT 70920

ACTAAGATTA AGCTTCTCTG AGTATTCATA ATAAAAATTT TCAGAAACTA TTATCTCTTT 70980

AGGAGAGTAT TTCTCAAGAT CCCTTTTAAG TTTTTCAAAA AAACCATTCT CATAAAACAT 71040

TATTCCAAGA CTGGAAGTAG ATAAATCTAT ATAAGAAAAC GAATAATAAT CTTTATAATC 71100

ACTAATAGCA ACTAAATAGT TATTAATATC ATCATTTAAA AAATCTTCAT CAATAATAAC 71160

GCCTGGGGTT ATTACCTCAA CAACCTCTCT TTCTAAAGGC CCCCCAGAAG TAGAATTGGA 71220

CGCTTGTTCA CAAATTGCAA CCTTTTTATC AAATAAAATT AATTTCCTTA TATATTCTTT 71280

ACTGGTATGA TAAGGAACCC CACACATTGG AACATTTTCT CTTTTTGTCA ACGTTAAATT 71340

AAGAAGCTTG CTTACCTCAA TTGCATCATC AAAAAACATT TCATAAAAAC TTCCTACTCT 71400

GAAAAAAAGA ACAGCATCTT TATATTTTTT CTTGATATCT AAATACTGCC TTATCATTGG 71460

GGTAACATTT TTTTCCATAT GCTTCCTAAA TAATATTGAA TTACAATTGA TATTATTAAA 71520

TAAATATAAT TCAATTAAAA AGAAAGAATA TAAAATAATA AAAAGACCAT AAAAAAAATA 71580

TTTTACGCAA TTAAACGCTA TTTAATTATT AAAAAGCCTA ATGTTTTAAA TTTAATTAAC 71640

TTTAAGGGTT TTTATTGTCC TTTTCTAAAA GATGCTTAAC AACATCGTTT GTTATATCAA 71700

CAGTGCTATT GTGGTAAAGA ATATATGGAT TATTTTTTTT CATAATCAAA GAAAACCCAT 71760
TAATTTCTGC AACATATTGA ATACCACGAA GTATTTTACT TAAAGATTCA CTATTATTAT 71820

TTAAACTATT AATATTGGCC AATCTCTGC5T GTTCTAAATT ATTCTTTGCT AAACTAGACA 71880

CTCTCTTTAG CTCATCAACT TTCAAATTAT ATTGATTTCC AAAAGATCTT GCATTATCTA 71940

AATCATTATC AGCAATCGAT TTATCATACA TATGTTTTAA ATTCTTAAGC TCTAAATTTA 72000

ACGTATTTAT TTGTTCTTGA TACCGATTTT TTATTTGATC AAGATTAGCT TTCAATTGAG 72060

GATTTAAAAC TTCAATCACA ATTCTATCAA AATCAACAAT TCCTATCTTA ATAACATCTA 72120

TCGAAAAAAC ATTAAAAGAT AATAAAAAAA ACAATGGTAA AATCAAAAAA AACACAAATT 72180

TCTCCATAGC ATTAATCAAT ATCTCATCTC AATTCCTAAG AAAAATTTAA ATCCAGAATA 72240

ATATTTGTAA TAGCTGTTAA CTTTATCATT GTCAAAATAA AAAGGATAAG CTATTACAAA 72300

AGACAGCGGC AATTGAGGTA AAAGACTTCT AATTCCAGTT CCCCAGCTAA AAGCAAAACT 72360

ACTAAAAGGT CTAAACAAAG AATTTTCTTG CCCTTCTAAA GAATAGGAAG CAAAATCTAT 72420

AAAAAAAGCA TCCCAAACTA AAATATTTTT TAACAAAGGA ATAGATATCT GCACAGTATT 72480

TACAAAAGAA CTGTAAATAT TTTTCAAAAT CCCCCAACCT CTAGCCTGCA TAAAATTTTC 72540

ACTAAGAATT ATGTGGTGAT GGGGTTGAAT TTCAATTTCA AAACCATTAC CAAGAGGAGG 72600

TAATATATTT GAATAGACAC TCCTTAAGGT CAAAATAATA TCAAAATAAG GAGTAAACAC 72660

ATCCTCATAT CCCAAAAGAG AAAAATATCT CTCAAAAGTT GTAGAAGATT TAATAAAATG 72720

GCTCTGACCA AATAAAAATC CACCAAAAAA ATCAAACTGT TGCTTAAGTA AAAATCCATT 72780

ATTAGATAAA GAGGTAGAAT TTCTTGTATC CCAAGCCGCG CTCAAACTAA GAGAATTTTC 72840

AAATCTAAAA GTTTTATAAT TGTCTCTTAA ATAATAATTT GAAGGTCTGT TAACCTCATT 72900

ATCATAAAAA ACATATTTTA AAGCAGTTTG CAAAGTGCCA AGAAGGGTTT GTTTGCCAAG 72960

ATAATTAGAA AAAGTATACC CGGTAAACGC TCCAAAACTA AGTTTAAGCA AAGAATAATT 73020

CATAGCATTA AAATCGGAAA AGCTTTTAGC ATCTCGATAT TCTTCCCAAC TTGTAAATGG 73080

ATCAGGAACT TCCCTCTTGC CAGAAAAAAT AGGCCCATTA ATATCCTGAT AAGCAGTATT 73140

AACGGAATGT GAAAAATCTA TAAATCCACC TACGGTCCAT CTTTTTTGAA AAAACCAATT 73200

ATCTCTAAAT GTCAAACTAA GACTTTGCTC TAAAAAAGAT AAATTTAGTC TTGCTGCAAA 73260

ATAATAGCCT TCGCCTAAAA AATTAGAAAG CTCCCACTGC CCAAATACTG AGAATGGAAA 73320

TGAAGAATTT GAATTGCCTC CAAAATTCAT ACCAAATCCA AAATTACTTG TTGCTCGCTC 73380

CTCAATGTTT AAATTTATTT TCATAAGCCC TTCTGTATTG CCTGGAACAA TATCAGGAAT 73440

TACATTTGAA AAATAACCAA GCTGCTGTAA ATTTGCCATA CCCATCTTAA ACTTGTCCAA 73500
ACTAAAAACA TCTCCCTCTT^KAGAGGAAT CTCTCTAAGT ATTACATGCG^BCgCTGTATT 73560

TTTATTTTTA GAAACAGTAA TAGACTCAAT ATGAGCTTTA TCCTTTTCTA AAATTTTAAT 73620 

TAACAAATCA ACAAATTCCC CTCTTATCTT TTGCGAAGGA ATAATTTCTG TAAAAATATA 73680 

CCCTTCTCTA AAATAACTTT CCTTAATTTT GACAAAATCC TGCTCAAATT TAGAATCATT 73740 

AAAAATATCA CCTTCGCTAA AGGTAATAAA ACTTTTTAAT TCTTCCAAAC TAAAAACTGA 73800

ATTACCAGAA ATTTCAAGCT TTCGAAATCT AAAAACATTG CCTTCTGAAA GAAAATATTT 73860

CAAAAAAACT TCCTTTTCTA GTCTTTTAGA ATCTTTAAGG GAATCTTTAA TATCAACAGT 73920

GCTATTGATA ATCTTAACAT CAATATATCC ATTATTTTTA TAAAAAGACT CTAATTGACG 73980

CTTGTCTTTA TCAACATTAC TTTTTAAATA TTTACCATCT GAGAAAAGAG ACACTACTCT 74040

TGATGCTAAA GATTTCCTCA AGGTACTGCT TTTAAAGCTT AAATTTCCTT CAAAGTCAAT 74100 

CCCCTTAACA ACATATTTGG GTCCAGCTAC TATATTAAAA ATAATATCAA CTAAATTTCC 74160 

TTCTTCTTTG ATTTCAAAAT TTGCAGAAAC CTCAAGATAT CCCATGTCTT TATACATCTC 74220 

TTCAAGCTTG CCAATACCTT TATTAACACT TGCAAGATTT AAAGGCTCAT TGGTTTTAAT 74280 

ATTCACCTTC TCAACAAGTT CGCTATTCCA AAAAACTCTA CTGCTATCAG AAAAAACAAC 74340

AGAATTAACT AAAGATTTTT CTTTTACAAT AAATGTAATA AAAAGATCCT CACCATCTAT 74400 

TTTAAATATA GGCTTAATAA GCCCAGAAAA ATAATCAAGA GAATAAAGAT CAATTTGCAA 74460

TTTATCAAAA ATTTCATTAG AATATGACAC GCCAATGTAA GGTTTTAAAA TATTAATAAA 74520 

ATCTCTCTCC TTCTTATTCT TAAGTCCTTC AAAATTAATA CCCTTTATTA TTTTCCCCTT 74580

GTAATTTTCA ACTTGACCAA AACTAAAAAC AACAAAAAAT ATTAAAAAAC TTACAAAAAA 74640

CAAACCTCTA ATTGAACCCA TCTTAACCTC TTAACAATTT AATATTTAAA TTTCCAAGAA 74700

ATGCCTATAT TATTTCCTAT TCCATCTAAA CCTTTTTTCA TAAAATTGTA ATCAAACTCA 74760

TAATTAACCA AAAAAAATGG AGAATCAAAC TCAATACCCA AATTGACAAC AAAATTTAAA 74820

TCTTTTGAAA AAGGAGACAT TTGTTCTTTC AAAAAACCAA AGCCCCCACT AATAAAAACA 74880 

CCCTCAACAA GATATTTGCC TACCTTAACA CTTGTATTGT CAAGAACATC AACAAAAGTA 74940 

GGATTCCCGA TTTTGAAAAA ATTACTATTA ATAGAATTTT TCAATATATC TGTCTTTATA 75000

CTCAACAAGT CTAAGTTTAA TACAGAACGC ATATAATCTT CAATGGGTTG AATTAAAAAA 75060 

TCAAGAGCAA TGTCACTTAC TATTCCAATT GCCATTTCAG CAGCATTAGT CCCTGCCGAT 75120 

CGCAATCCTC CCTCATACCC TCCTATTGTT GAGCCTGAAA GCAAATATTT AATTTCCTGC 75180

TCATTTCTAG AAGGATAAGA CATAAACTCA ATTTTCCATA AACTTAAAGG ACTATCAATG 75240

CTTATTGTAA CAAGCAGTTT ATCATTTCTA TCCTTAATAG TATTTGTAGC CTCCGCTTTT 75300 
ATCCATGGAT CAAATTTAGC CCTACTCTCA TTAAAGGATA TATAAGAGCC GCTTTTAAAA 75360

ATAAATTTTT TATTATTATA ATTAACAAAA CCACTTGCAA TATTCAAATC TCCCTTAATA 75420

ATAAAATCAT CTGTTTTTGT GTCAGATTTT ATTACAAGTT TATCCCCTCT TGAAATAGTA 75480

GCTTGTAAAA AAGAAATATT ACTATCTGGC CAATGAAAAG TAACACCGCT GTCAAAATTT 75540

ATCTCAAGAT CAGTTAAAAT ATCAAAATCT AGAAGATTAA TATCAGTTTG CAATCTTTTT 75600

GCTCGTTTGA AAGGATTTAT TAATAAATCA ACAACAGAAC TTTCAAGAGA ATAAACCCAA 75660

GCATTTGAAA TATTTAAAAT GCCTTTAAAC ATAATTTCAT CAGCATTTCC TTCAATTGAA 75720

AAATCGCCTA AAGCATAGCC TGTAAAGCTT AAGGCAATTT TTTCAAATTT AATAGGAACT 75780

CCCGTTCTAC CAGTCACATT AATATCTATT TTGTAATAAT CAATAATAGT ATCACTTAAA 75840

AAATTTAAAT TTAAACTAGT AGAAACAAAA ATCTTGGAAT ATCGATCTAG GTTAAACTCA 75900

TTACTAAAAA TGATTTTATT ATCCTGAATT GCAACTGGCA TATCAAATAT TTCTAAAGCT 75960

CTATCACCAC CAAATTTTCT AGAAGCTCTT AAATACTCAG TGCTAATTGA TCCTTTTTGA 76020

ATATTTAAAC TTCCATTAAT ATTAGGATTA TACAAATCCC CATCAATATC GAATTCACCA 76080

TTTAAAACAA GATCGTAAAG AATAAAATGA TTGTCAACAT TAAATAAAGA ATGTGAATCC 76140

AAAAAATCTT TGGTGATTAT TTTTGAATCA AATTTAATAT CTCTCACATT TCCAAGAATT 76200

TTATTTTTAA TTATTTTGCC TGACGAATTA AAACTAAGGG GCAAATAATC TTTTAAAATA 76260

AAGCTATATT CCCCACTGCC ATAATGATAT AAAACATTAA CAAGGTCATA ATCAATTGAA 76320

GCCATATTAA ATTTTTCAAA ATCATTGCTA AATTCAATCG TAAGATTGGA TAAAGATTCA 76380

TTTTTATACT TAATGTCTCT AAAGAAAAAT TCGCCCTCTG TTTTGGTTTT CAAAACTCCA 76440

AATTGCTCTT CAACTCCACT TTGGAAATTT CTAAAATTTG CATTAAACCT GTAAGAAAAC 76500

AAGTCGCTAT TAAAAACTAT ATTAGATACC CCTATAGAAC TGCTTAGGTC ATATCTTAAG 76560

CTTCCGGTAA GAAACTCTTT TTTATTGCGC TTGGCTTTTA TGTCATATAT GTTAAGCTTA 76620

TTATCCAATA ATCCTAAATT CATAGAAAAC TGAACGGGAA CGCCCAAAAA AGTTAATTTA 76680

TmTGCCTCAA GATATCCACT CAAAGAATAA TTTTTAAAAT CATTCTTTTT AAAATTTAGC 76740

AAAAAATTAC CATTAACCTT TCCCTCTAAA AGGGAGAAAG AATTAAAATT GTAAAAATCT 76800

AAGTCTTGAA ATTTAAGCAA AAAATAACTT CCATATTCAT CGGAATTAAC CCCTAAAAAA 76860

TATCTCTCTG AATTTAAGCT AGAAAATAGA TTCAAACTTC CATTAAAATT ATCTTTTAAA 76920

TTAAATTGTC CATATCCTTG CAAATTTGAA AGTTTGTTTA TAAGTTTAAC ATCAGAAATG 76980

TACAATTTAT CATATTCATA TAAGCCCTTA AAACCAAAGT TGAAATTATA GGCAGGTGTT 77040
TTAGAAACTT TTATTATATT^KACTCGTCT ATTTTAAATT TCAAATCTTC^K^TAAGGCCT 77100
AAATAGTTCC CATTAAAATT TATATTTATT AAAAGATCAG AGGTTTTATT GTAAACCTTA 77160
AAATCATTAA CATTAAGGGT ATAAACTACT TTAGAATCAA AATAATTAAA ATCTAATTTA 77220
ATACCAAGAG GAGATTCGGC AACAATAAAC TTATCTTTAA GATTAACATT AAAATGCAAA 77280
GGATAAACTT TATTCAAATA AGAAAAATTC GTGCTAATAT TAAACCCACT TTCAAATAGT 77340
TCAACAAACA AGTTAGAATG CAAATTGTGA TCATTGTACT GCAAATCAAA ATTGTTTGAT 77400
TTGTAAAAAT TTTTTTCTCC ACTGGCATCA AGCACAAATT TGAAATTGTC TAACTTTGAA 77460
AATACCATGA AATTGAATTT TTTTAATTTA TTTTTATGAT AATTAAATTT ATTAAAATTA 77520
AAATCAGAAA CCAAATTTAA ATATTTACGT GAAAATAAAG TCTTGGGAAA AAAATTTATC 77580
AAGTGAGAAC TGGGAATAAC TTCTTTTAAA AAAAGCAAAG GAAATTCTTT AATACCTAAG 77640
CTTAAAGAAA ATTCTTCATC ATTAAGATCT CCTTTTAAAG AAATTTGAGA GTTTTTATTT 77700
TCTAAATAAA TCAAATAGTC AACAAAAATT TTATCCTTTA AAAAATAAGT TTTTAAAATT 77760
AAATTTTGAA AACTAAGATT TCCAAGACTA AAATTATCAG ACTTAACCGA AAAAATATCT 77820
TTATCTTTAT TAAAATCTAA ATACCCATTT AGGTCGTTAA AGTTTAAAAT TTTTGCAGAC 77880
TTAAAGTTAA GACGTCCCAT TGGAAGCAAA TCTTTCAAAG AATAATAACC TTTATAGTTT 77940
ACAACCCCCC TTTTAAGCTT TAAAAAAGCA TTCTTAACAC TTACAATTCT ATCATCCCCC 78000
TTGATTTCAA GCTGCAAGCC TTGAACTTCT TTTCCTATAG TATCTACATT TAAAGATGAA 78060
TCTATTATTC CTGCATATCT TAAATCTTTG TCCTTAAAAT CATAAGAAAA TGCCAATTGC 78120
CCATTTAAAC TTATATCAAA ATAATCTTTA TAAATTTCAA AGCCTTTGTT TAGCTTAATC 78180
CAATCTAAAA GACTAACATT GAAAAATAAA GCATCTAATC GAACAAAACC ATTAGCCTTG 78240
TCATAACTTA AATTAAAATC AAAATTCTCT CTTCGTAAAT TAAAAATTTT TAAATTTCCT 78300
TTTGAATAAT TTATTTGGAA CCCCTGCTCA AGTAAAGAAA AATAACTTGT TTTAAATTCA 78360
AAAAAGCTAA AATTAACATA GCCATCTTCA AAGCCTTTTT TGAATTTCCC CTCAAAATAG 78420
AAAGTTGAAT CCAAAATTCC ATCATCAACT CTTTCAAAGG GTAAATTAAT TTCTAAATTT 78480
TTAACAGCAC TAAAATCAAC TACAGAGCTA AATAAAAAAT CTTCATCTAC GGTACTTAAG 78540
GAAAAATTTT TAACTTGAAA ATTAAGCCAA CTATTATCAT TAAGCTTGAT ATTAATATTG 78600
ATATTTTCTA AATTAATATT TAATCTATAA AGGTAGTTTA AAATTTTATT AAAAACTGTA 78660
TTTTCATTGT CAGAATAGGC ATTGCTiVGGA TTTAAATCGC CAGATAAACT AAAGTCGTTT 78720
ATATCAAAAT TGAAATTACT TCCTTTAACA TAAACATTTA AAATAATATT TTCATCACCC 78780
AAAATTAATT TAAACAGATT TAAATCTATC CTAACAATAT CTATTAATAT TTTATCTTTT 78840
CCATCCAAGC TTAACTCTAA ACCGTCTATC TTGATTGATG ATAAGAAATA CGGTGAAATT 78900
TTATCATATT TAATCTTAAA GCCAAATTTT GATTCAAGAT ATTTTATAGC AAAAAACTTT 78960
GCAGAATAAA TTTGAGCTTG AACAAATAGA TTAATGGAAA AAATTATTAA AACAAAAATA 79020
AAAAATGGCA AAATCAACAA TATAAATGTC TTACTTCTCA AAAACAACAA ATTCATACAC 79080
TCTATCGATA ATTATTATTA TATAATAATT ATCGATAACC TAATTATTGA CACCAAAAGA 79140
AAGGAAGAAA AAATATTTGT GATTAAAATA TTGAAAAACT TTTATTGCAT AGAAGGAATT 79200
GATGGAAGCG GGAAAACAAG CATCACTAAT AAACTAAAAG CTCTTTGCAA CGATGAATCA 79260
AGGTATTATT TTACAAAAGA ACCATCAAGT GGAATAATTG GAGAAATGAT AAGAAAGCAA 79320
TTAATGAATT TTGAAAATCC TTTAGAAGAA TCAACATTTG CATATCTTTA TGCTGCAGAC 79380
CGACACGATC ATTTATATAA AAAAGGTGGA ATACTGGAAA TTTTAAACAC AAAATCTAGA 79440
AAAATAATAA CTGATCGCTA TTTATTCTCA TCGATTGCAT ATCAAGGAAA ATTAGGATAT 79500
GAATTAAATA AAAATTTCCC ATTGCCTGAA AAAGTATTCT TTATCGAAAC AGACCCAAAC 79560
ATAGCTTATG AAAGAATACA GAAAAATAGA ACACAAAGTG ATCTTTTTGA ACTTGAAAAA 79620
TATAAAACTT TTGAACAAAT TGCTCTAAAA TATTTAAAAA TATTTAAAAA ACTAGAAAAA 79680
AAAATTAATG TGATTTACAT CAACAATTCA ATAAAAGATA ATTTAGATAA AAACGCAAAA 79740
AAAATTTTCA ATCTAATAAA ATTCTAATAT AATTAATCAT ATGCATATTT TCAAAAATGT 79800
CCCCTTCCAA ATAAATTTAA TTTTATTTCT TTTAGTATCA GTTGCAAAGA TAAATGCATC 79860
GTCCAAATTT TATTACGCAG AACAATGGTA TGTAATTTTT AATTCTCAAA TGAAAAAAAA 79920
ACCTGAAAAC TATAAAAAAA ATATATTTTT TCTTCAAAAA GCCTTAAAAT ACCCATTTGG 79980
AAATCCAAAA TATTCTCTAA CTAAAATAGA AACCAAAGAA CAGTGGGAAA AATATAAACT 80040
TCTTTTCAAA ATGCATGTAA ACTTGCTTCT AGTTAGGCAA AATTTACATT TAGGAGATTT 80100
ATTCGACACA AGAAATTTAT ATTTTTTCAA AACTCCAGAA AAAGATGGAA TTATTTCCAA 80160
TCTAGAAAAA TCAAAAAAAT TATATAAACT AGCTATTAAT TACTACAGCG AAGCACTAAA 80220
ATACCACAAA AAACTTGAAA ATTACACAAC TGTTAAACTA GAAAACGATG GAATAACAAA 80280
CTGGGAAGAT GAATATCATA AAATTTCTCT TAAAGAGCTT AATTACTATG ACATTATTAA 80340
AAAAGAACTA CTAAGAATTG ACGAAACTAA AGCATTTTTT GAACAAGGGC CAAACTATTA 80400
TTAAAAAAAC TCTTTGCCCT CTTTGGAAAA AAAAATTTTA TTTATATAAT CCTTATTTAA 80460
AGAAAACTTA AAAACAAGAT CTTTAAAATT ATCCTTACTC AAAATACTAT ATTCTGAGAA 80520
AAGAGTTATT AAGGCTCTTT CTGCTAAAAA AGGCAATTCT AAAATATTTC TTAAAATTTC 80580
GGCTCTAAAT TCACGCGCTT Wtcactatc TAAAGAATTT ACAAAATTAG^RTTTTC AAC 80640
ATCAGTGCTT AGTGCTACAA TTTCATTTTC AAGAAATTTT TTATCTCCTT GTGAATATAG 80700
CAAAGCAAAT GGCTTTAAAA AAGAAAAAAA ATATTCCTTA CCTATAAGAT AATGGTGCAT 80760
TGAACATCTT GTAGAGTAAT TTGGGTAAAT AGCAGATTCA ATCTCTTCAT CGCCACCAAC 80820
AATAAGCAAA GGATCAAAAT AAGGAGCTGG AATTTCTTTA TTGTTGTAAA AATAATACCT 80880
ATACGAAAAA AAAGATTCTT CCATGCAATC AATGTGCCCA CAATAAGCTC CAAGCTTATA 80940
AATCTTATCT GAATATTCTA CTAAAAGTTT GGCAGTATCA TTGAAAGATT CTTTTCTAAA 81000
ATTATAAAGC AGTGTAGGTA ATATAAATAA AATATTCTCA TTTGATGAAA TTTTATTTAA 81060
AACAAAAATT AAACGTTCAT CATAAAAACA TGCCTCATCA ATAATAAAAG TGCCACAACT 81120
AGGATTAGAG GCTATTAAAT TTTCAATATC AAAAGAGTTG CTAGCATAAC CAATCTCATC 81180
AATTTTATCT TTTCCACCGC CTCTATATGG TATTACGTTT TCTGGATAAT CTTGAAACCT 81240
CCTCTTGTCG AGAAAATTTC TAATAAAAAA TACATTAACC CTACTTCTAT TTCCTTTAAT 81300
AATATTGCCC AATACCTTGA AAGATTTTTT TCTTACAACA AGCGAATCTT TATAAATTTT 81360
TGCAGCATAT TCTGTTTTTC CACTTCCCAT GGGTCCAACT ACAAGAATTA AGTTTATTTT 81420
TACCCTAAAA TCAAAATGAC TAACAGAGAC AATGTTATTT AATTTAGTAT CTTCTTTATT 81480
AGCAAAGTCT AAACAAAAAC CCAAAAATCC TCCCTAAAGT AAATCAAATT CAATTATATA 81540
AATAAAAACA ACAAAAAACA TTAACATTAA AAGCCTAAAA ATTAATAATT TAGGATCTTA 81600
TTAAAGCTAT TATTCAAAAG AATAATAGCT TTCAAAACTA TCATCATCTA ACAAAGCTTT 81660
CTTTATTTTT AGTTTATTCT TCTCATAAAT TTCAATATAA TTAAATTTTT TAGAACTATT 81720
AACTTTTTTA TAAATTGAAA CCAAACTTCC AACCACATAA TTTTTAATTT CTACCAAATT 81780
AGATCCAAAA TATTTTTTTT TACTTATCAA GCTTGATCTA ATATCCAAAA CAGAAGTTTT 81840
TTCAAAAAAT AAATTTAAAG CATTTGGAAT AAAAAAATCC CTAATATCAA TCAAGAACAA 81900
CTCTCTGATC CCAAAACTAT TAAAATTGCA GCCCAAATTG TTAAAAAGAT TAAACTTAAA 81960
AGCATGTAAA GCAAAGGTAT AAACATCTCT TTTATCAGGA TATTGAACCT CAATCATATC 82020
AACATAAGGA TAAACAGAAT AATTTATTTT ATAACCAAGC AAACTATTCT CATAATAAAC 82080
TGGAGACTTA TCTTTAAAAT AAATTTTATT AAAATATTTG CTATTTAAAC TAAATTTGCT 82140
GTCAAAATCT ACTATTCCTG AAAAAATGCC TACTAAAACT CCATCTTTTA GTATGGCAAC 82200
ATTAT^TTTA TTATCAATGC AATAAATTCC TGACAAACTT TTATAATCTT CTAAAAATTT 82260
ACTCTTCATG TCGGGATCCT CTAAAAGATC ATAAAACATT TTATATTCAT TTATTGTCTC 82320
AGGAGGAAAT TTTTTAAAAA TCCTATAAGC TTCATCAGGA TCTAAAAGTT TGTATTTTAA 82380
AAAATAAATA CCGTGAATTC CATTATATTC TTTAATTTTG CCCTTAAGAA AAGAAAACAA 82440
AGAAACAGAA TTGTCATAAT TAAAATTTGA ATTTCTAAAT ATATAATTAA TGTTTTCGGT 82500
TGTAAAGAAA AAATCTTCTT TTGATCTAAT AGTATTAAAG ATGCTAAAAA AAAACCTATC 82560
TTCTTTTAAA AACCATTTAA AAAATAAAAA ATCATTACCA TATAAAGAAA AAGCCTTAAA 82620
TAAAATTTTT TTGAAATCAT CTCCAAAACT TTCAAGTCTT CCAGAAGCTT CAAGCCTCAT 82680
GCAAATTAAA TCTTTGTCTA TTTGCAATTC TTCATTCAAG TTTCTATACA AAGTAATAAT 82740
TTCTTCATAC ATATTCTTAT TTTCATTTTG TCTTAAAAGC AACGAAAAAT AATAAGTGTA 82800
TACTTCTTTA AGGGTTACTG CTTTAAAGTT ATTAATATTA ATTGCTTCTT TTAATAAATA 82860
AATTTTATCC GTTAAAGCAA CATTCTCTTT TAATGAAAGC TCATAAAGAG CATCTGAAGC 82920
TAGCAAATTG AAATCAAGAG ATCCATAAAT TAAATCATTT CCTTTTTGTA AATTGGAAAA 82980
ATTTTTAGAC TCATTGAAAA GAATACTTGA AATATCTTCA TCTTCTTTGA CTGAAATTGC 83040
ATATAGGTTT ATAGAAATAA AAAGAATGAA ACCCATCTTC AAATAAAACA TAAACACTTG 83100
CCTAACCTTT TACATCCTCC CCCTTTACTT CTTTTGTAGT AGAGTCTGAT GAAAATAAAT 83160
CGTATTCATC CTTATTAGCT TCAAAACCCA AAAGCTCTCT CACTTCTTTG TCTGTCAAAG 83220
TTTCTTTTAA AACAAGTTCT TTTGCAAGCT TAACAAGTTG ATCCTTATGC TTTAAAAGAA 83280
TATCCGATGC CTCTTTTAAA CATTCTTCAA GTATTCTTTT TACCTCTCTG TCAACTTTAT 83340
CGGCAGTGTT CTCAGAATAA GCTTTAGCCT TTGAAAACTC TTTCGGAAGG AAAATAGGTG 83400
CTTCATCGTC TACTAAAAAT ATTGGACCAA CTTCTTCACC CATTCCCCAC TCCGTAACCA 83460
TTTTTTTAGC CAAACTAGTA GCTTGCATTA AATCATTTTG AACACCCGCT GTTGTAACAC 83520
CCAAATTTAT TTGCTCGCTA GCATAACCAC CATAGCATAT CTTTATTTTG TCAAGAATTT 83580
GGTGTTTGTT TATTGAAAGT CTATCTTCCC TTGGAAGAGA AAATGCAACA CCAAGTGCCC 83640
TGCCCCTTGG AATAATGGTA ACTTTGTGAA GTGGATCAGC ATGTTCAAGA TAATAGTGAA 83700
GCAAAGCATG GCCTGCCTCA TGATAAGCCG TCTCAAGCTT TTGCCTATCA GTAATAGTCA 83760
TGGATTTTTT TGCAACTCCC ATCAATATTT TATCTCTGGC TTCTTCCATA TCCTTCATTA 83820
AAATTTCATC TTGATTATTC CTTGCAGCTA TTAAGGCTCC TTCATTAATT AAATTTGCAA 83880
GATCAGCACC ACTAGCTCCA GGAGTAGCTC TTGCTATTAC TTGTAAATTA ATATCTTTTG 83940
AAAGCTTCGT TTTTAAAGAA TGAATATTTA ATATTGCCTC TCTTTCCTTA ATATCAGGCA 84000
AAGAAACTGT TACTTGCCTG TCAAATCGTC CAGGCCTAAG CAAAGCAGAG TCAAGAACAT 84060
CGGGACGATT TGTAGCGGCC ATAACAATTA CATTGGTATG CGTTCCAAAT CCATCCATTT 84120
iCTCTC TTTCATCATG CAACTAACAG CTGATTAAGA ACCACCGCCA^GCCCCGCAC 84180
CACGACTTCG ACCAACAGCA TCAAGCTCAT CAATAAAAAT AATACATGGA GAATTTTTTC 84240
TAGCATTATC AAATAAATCT CTAACACGAC TTGCTCCAAC CCCAACAAAC ATTTCAACAA 84300
AATCTGAGCC TGACATGTGA AAGAAACTAA CCCCAGCCTC ACCGGCAACG GCTTTGGCAA 84360
GCAAAGTCTT GCCAGTACCC GGAGAGCCCA CTAAAAGCAC TCCTTTGGGG ATTTTTGCAC 84420
CTATTTTTTC AAATTTTTTT GGATTTTTAA GAAATTGGAC AACTTCTCGA AGCTCTTGCT 84480
TAACCTCTTC TTGACCAGCC ACATCTTTAA AGGTGATTTT ATTCTTTCCA GCTTCATACT 84540
TTTGAGCATT ACTTTTCCCA AATGTAAAAA CCTTCCCACC GCCACCTTGA GTTTGACGAA 84600
ATATAAAGAA AAAGAAAATA AAAAACAAAA TCCATGGCAA AGTTTGTAAT AAAACCCCAA 84660
TCAGAGAAGC TTGACTTTTC CCTGAGCTAA GCTCAACTTT TTTATTTTTT AGTTCTGAAA 84720
GTAAATTTAT ATCAAGATAG GGAATGCTGG TAGAAAAATA AGACTTTGCA AAGTTAGAAC 84780
CCTTGACGAC AAATTGAATC AAATTTTTAT CAATTATTAC TACAGACTCA ACTAGACCAT 84840
TGTCTAAATA ACTCTGAAAA GTGCTATAAG GAACATTTTT ATAGCTTTCC CCCCCCCTTA 84900
TAAAATATGA CATAAATATT GCTGAAATTA GAAAAACAAC AACAAGTCCT AAAATCCAAT 84960
TTTTATTCTT TTTTTTGTTG TTAGATTTTC CATTGTTATT CATATTATTA TTGCCATTCA 85020
TTCCTTTAAA AGCCCTCCAA TCAAAGATAT ATTAATTTTT TTAAGAATGC TTTTTTCACT 85080
CCATACTAAG TTTAAAGTAT TTAAATCAAT AATCCCAATT AACCTGTTAT CTAATGCTAA 85140
CAACATTAAA TAAGCCGGAT TACACCTTAT AAACTTAGAA AAAAATTTTT TTGCTTTCAA 85200
TCTATCCTTA AAAAATTTAT ACCTAAACTC ATAAGAACAA CATTTTAATC TTGATACAGA 85260
AGCTGCATTA CACTCTAAAT ATTTTAGCAA AATCTTACCT AAAGATAAAC TATGCCATTT 85320
ACCAACTTCC AAAATAAAAT CAAAAGGTTT GTAAAATTTT TCATCCCTTT TAAAAATTAA 85380
ATTAATTTTA TTATGCCTTT TTTCTAAAAA AAAATCATTG GTTTTTAACA AAACATTATT 85440
TTTTTTCCTA TTAATCTCTA CTTTAAACGC TTCATTAAGA GCTTTATAAG AAACTTTGGC 85500
TGCAATTCCT TCTGAATTTA AAATTTTAAA AATCAATCTA AATACCAAAT ACTTAGGAAA 85560
ATCTAAGAAA GTTTTCAGAT CAAAAGAATA ATAATATTTA CCTTTCTCAA CAGGAAAAAA 85620
TTCATCTTTT CCAAAATAAT CCGCAAATTC CTTTGAAAAT TCAGATATTC TTTTAAGACA 85680
TTTTTCATAT CCTTTAAAAA CCTTTTTTAT AGCGGGTAGC AAATTATTTC TAACCCTATT 85740
TCTTAGATAT AAATTTTGAG CATTTGTACT ATCAACAAAA AACCCAATAT TATTCAAAGA 85800
TAAAAAATTT TCAATTTCTA GTCTTGAAAC CTCAAGCAAG GGCCTTATAA TGTTTCTATT 85860
GACACTAGGA ATACCTGAAA GACCATCCAA AAAAGATCCT TGAAAAAATC TCATAATTAT 85920
TGTTTCAATT TGATCGTTTT CGTTATGAGC AAGTGCAATA TAATTTGCGC CATTTTCTTT 85980
AAGAGCATTT TCTAAAGCAA TATATCTAAA TTTCCTTGCA AGCTCCTCAA TAGATACTCC 86040
CAACCTAGCT GATTCACTTT TTATATCTAT ATCACATTTT TTAATTTGTA AAGCAATATT 86100
GTAAAGATCG CAAAACCCCT TTACATGCTC TATCTCTTGA TTTTGCTCAT TATCAGATCT 86160
AATAAAATGA GCAAAATAAA ATGCAATAAC ATTATTGCTC AGGTAATATT TTAAATTCAA 86220
TAATAAAGCC GTAGAATCAG CCCCGCCAGA AAAAGCAACA ATAACCCGAT TTTTATCTAA 86280
AGAATTTTTT TTATAAAATT TATCTATTTT AATTTGTATA TTTTCGTCTA AAAAATGCAT 86340
TAAAAATTAC TAGAATTAAT TTTTTGAACC TTATTTTTAA AATTTGCTTC GCTAATATCA 86400
ATCAGCTCAT CACTTAAAAA ATCATACAAT TTTTCAAGTC TATGCATTTC ATCCTTGCAA 86460
AATCTAGAAA GAACAAAACT CTTTATATCA CGCATAAGAT TGCTTCCAAC CCCAATATAA 86520
AGTCTACTAT AGTTAGAGCT TCCAAGAACA CTTGAAATCG ATCTAAGACC ATTATGAGTA 86580
GACACACCAC CTCGCTCTTT AAGCCTGCAT TTTCCCAGGG GAAGATCAAC ATTGTCAAGA 86640
ACCACTAATA AATTCTTTAT ACACATATAA AAATCTGAAA ATATTGAAGG AAATAGAGAG 86700
CCACTTAGAT TCATATAGGT CAATGGCTTA ACCAAAATTA CTCTTCCGGA AATCATTTTT 86760
AATTCTGAAT ATTCATACTT TTTTTTTCTC TTTAAAAAAA GGCCGTTTTT AGAAACAATT 86820
TTATCCAAAA GAGAAAAACC AACATTATGC CTAGTTAAAG AAAATTCTAA TCCAGGGTTT 86880
CCTAATCCAA GTATCAACAA CCCCATAAAA TCACTTTATA ATAACAACTT CCAAATTCTC 86940
GTCATTTTCT GCAAGCCTAA CATTAGAAGG CAACACAAGA TCTTTAAGAA GAACGCTATC 87000
TCCTTTATTA ACAGGAGTTA AATCAAGCTC TATGAATTCT GGCAAATCCA AGGGCAAAGA 87060
CCTTACCTTA ACCTGCTCTT TTAAAACAGT CAAAATTCCA CCTTCTTTAA CCCCAATAGA 87120
AGCTCCAATA AGCTTAATAG GAACATATTT TTCAAGCTCA ACATTCCTAT CTACTTCGTA 87180
AAAATCAATA TGATAAATGA GCTTGCTGGC AATATTTTCT GCAGCATCTT TAACAAAAAC 87240
ACATCTTTCA AGCTTGCCAT CATCTAATAT TAAGACAGTA TTATCCGTAA ATTTTGCAAA 87300
TTTCTTATTA AATTCACTAC TCTTAATTCT CAAGTGTGAA ACATCCTTAC CTTGCCCGTA 87360
AACAACAGCT GGTATTTCAG ACTTAGCTCT TATTCTGCGA GCATTAGAAG ATCCAAAGCT 87420
TGATCTATAT TGGCAACTTA AAACCCTGCT ATTTTCCACC AAAAAACTCC TAATGTAAAA 87480
TACTGGGACG GAAGGATTCG AACCCTCGAA TGACTGGACC AAAACCAGTT GACTTACCAC 87540
TTGTCGACGT CCCAAATACC TTATTAATAC TTACAATTGA CAAGCA^ACA AGATACAACG 87600
AGCTTTGTCA ACAATTATCT AAAAATCTAA TTCAAGTTCA AGATCGGCTG GATTTTCTCT 87660
TATATATTCC TTAAAAATAG^RgTTTGCAA AGCTTTTCTA AAATCCTGAC^KATAGGATG 87720

TACAATGTCT TTATATTCAC CGACTCTAGT TCTTCTGTTA GGCATCGCAA TAAATACTCC 87780

CTTTTGCCCT TTAATAACTC TAATATTGTG AAGAACCAAA CAGTTATCAA AAGTAACTGC 87840 

AACATATGCT AATAATTTAG AACCAGAATT TTTACTATCA ACTTTCTTAA TCCTTATGTC 87900 

TGTAATATCC ACTTATAAGC CTCCCGCAAA AAGTACATAA CTTAAATCTA AAATATTTTC 87960

TATTTTTTGT AAACACGTTT TTATGATATT TTTAGTTTTT TTAATTAATT TTAATTAAAC 88020 

TAAGTAATTA GGGATAAATA ATGGTTCCTT TGGACCATAT TTTTTCAAGC TCATAGTATT 88080 

CCCTAGATTT TTCACTCATT AAATGAATTA CCAAATTTGC ACCAGAAACA ACAGTCCAGT 88140 

CATAAACCAA CCCTTTTCCT TCAGCATTAA GATTAATTTT TTTTTCTTTA AAGAATTTAA 88200 

TTATCTTGTC AATATATAAA GCTTCCATTT GCTTAAATGA TACAAAAGTG GCTATTATAA 88260

AAAAATCAGT CCAATTACAA ATATCGCTAA CATTAATGCC TATAACATCA ATTCCATTAA 88320 

AATCACTTAT TATTTTACAT AAATCATTAA TATCATTTAC TTTTAACATA CCTTCCATCA 88380

AAATCATCTC CTAAAATAAC TATTACATCA GGATTAATAT CAAGATTATC AAGCTCTAAC 88440 

AATCTTTTAG CTTGAACCTC AGAAATTGGC TTAATATTTG AAGTTTTAAT TACCTCTCCA 88500

ACCCTAACAG CCATTTCTAA ATTATCCGAA TTATTTATAA TCAGGGTATT TTTATAAGAA 88560 

TTTTTATCTG CATTACCAAA TTTTAAAACT TTAAATTTTA AAGAATTAAA AATATTTGCT 88620

GTTTTTTTTG CAAGCCCAAC AACTTTTGTT CCATTTAAAA CAACAATCTT TACTATCTCT 88680

TCTGCACCTT CATTAACCAA CTCTTTGTTT AATTTATCCA CCGATTCTTT TAAAATAGCA 88740 

CCCCCATAAT AAGGAAAAAC CACCTTTATC AAATTATTAT CATTATCCTT AAAAATCTCT 88800

TCTTGTCCTT TAATATTAAT AGAAATAATT TTATCATTAT TTATTTTATA ATTTTTAACA 88860

ATATACTTAA AAACAACCTC TGAAAGGTTA GTATCTAACA TGGAATATAT TTTAAAAAAA 88920 

CTGTCATTTT CAATGCCAAA ATCTGAAATT TGAAAAAGAA GTCTTTTAAA AAATTCTTTA 88980 

AAAAATTCAA CTCTCTCTTC AAACTGATTA ACATCATTAA AATATCTCAA ATAATCATAA 89040 

GCCTTATCAC CATCAAAATT AGAAGTGCCA GAGGGTATTA AAATAGAATC CTCGAAACTA 89100 

TAAACTTTCA CTGGGTTTTT AACAAGAAGT CTAACTCCCC CTAAGTAATC AATAAGCCTA 89160 

ACAAAATTTT CTTTTTGAAA ACGAATATAA TAATCTGATT CATGAGATAA TTGTGTATAA 89220

ATTTTAGATA AAAATTTATT AAAAGAATTT TTTTTATAAA GATCTTTAAA CCAAGATATA 89280

TTCCCTTTTA AATCTTCATA TCCAGTATGA ATTGGAATAT CAAAAAACCC AATATTTCCT 89340 

GTTTTCATAT TAATAAAAAT TTCTTGCATA CTTACAAGGT TTTTGTTAAG ATCTTCTATT 89400

AGAAACAAAA AACTAATATT ACTCTTTGTA TTAAGCTCGA AGTAAACCAA CTCTTTTTTC 89460
GAACTTCTAA TAAAAAAAAT TACTACACTT GCTATTATTA AAACAATTAA AAATAAAAAA 89520

ATTAAATCCT TTCTCAAACA TTTACCTTTT TAACATATAA ATTATTATCT TTAATGTATT 89580 

TTAACACACC AAAAGGCAAT AAATAGCTGA CGGGCAATCC ATTTACAATT CTATTTCTAA 89640 

TCTCTGATGA GGAAATCGGT ATTATTTTAT TATCTATATA AATATGCTTA AAAGAACTTT 89700 

TAAGTCTCTC TTTGTAGATT CTATGAGCAA CAACAAGTTC AACAGAACTT ACAATACTTT 89760 

GAGGATCTTT CCATGAATCA AAATTTTGAA AAAGATCATC GCCAATAATT AAAAAAAGTT 89820 

TATCGTTTTT GTATTTTTTT TTAACACAAG AAATAGTATC AACAGTATAA GTTATACCAC 89880

CATTTATTAT GTCGCAATCA TCTATGAACA TTTTATCTTC ATTCTCTAAT GCAAGCTTGA 89940 

GCATATCTAT TCTATTGCTA ACACTAACAT TCTCATCAAT CAATTTATGA GCTGGATTGC 90000 

AAGTAGGAAT AAATATTACT CTATCAATAT TTAATAAATA CTCTATTTCT TTAGCCAAAA 90060 

AAATATGTCC AATATGAACT GGATTATAAG TGCCCCCTAA TATTGCAATT CTCACGATTT 90120 

CTTTCCTAAA TAAAATCTGA TATCCAAAAA CCAAGTCTTA AAAATAAAAA GCCCACAATA 90180 

AAAATCAATT TTATTAAAAA GTTTTAGCCA AAATAAAAAA TTCTTTAATA AGTTCATCAA 90240 

TTCCTCTATT CTCATAAATA GAGATGCCAA CAACCTTTTC TTTTCCTAAG GCTTTTATCA 90300 

GGCAATCAAA ATTTTTCTCA GAACCGTCCA AATCAAGCTT GTTGGCAATA ATAATTTTTT 90360

TTTTATTAAA AAGCTTATGG CTATAAGATT TTAATTCATT TAAAAGAATG TTATATGACT 90420 

CCAAAAAATT TGCTTCAGAA ATATCAATAA CCAAAGCTAA AATTTTAGTT TTAGCAATAT 90480 

GCTTTAAAAA TTTAGTCCCG AGCCCTACTC CAAAACTAGC ACCTTTAATT ATTCCGGGAA 90540 

TATCTGCAAT AATCAAATCA TCATAAGAAC GCCTGAGCAT ACCAAGATGA GGAATCTTTG 90600

TTGTAAAAGG ATAATTTGCG ACCCTAGATT TTGCTGAGGT TATCCTATTA AGAAGAGAAG 90660 

ATTTACCAGC ATTGGGTAAT CCAACAAGCC CAATATCCGC CACCAAAAAA AGTTCAAGAC 90720

GCACGCTCAA ACTATTACCC GATTCTCCAG GTTGAGCAAA CCTTGGAACC CTTCTAACTG 90780 

AAGTTTTAAA ATTCCAATTA CCAAGACCCC CTCTGCCACC TTTTAAAACA ACAAATTCGT 90840 

CATTTAAATT TTTAAGCCTA TACAAAAGAG TTCCATCATT TTCATTATAA ACTTCTGTAT 90900 

TTGGAGGAAC AAAAAGAGTT AAATCTTTAC CATTAGCACC ACTTCTTTTA AAACCCATTC 90960 

CAGGTTTACC ATTTTCAGCA CAAAGCACAT GACCATTTTT GTAAAAAGAT AAAGTGCTAA 91020 
GATTTTCCCT CACCTTGAAA ATTACACTCC CACCACTCCC ACCGTTTCCG CCATCTGGAC 91080
CACCTTTTGC ATTAAACTTT TCTCTTAAAA AAGAAACACA CCCAGAACCA CCATTGCCCG 91140

AAACTACCGT TATATTTACA GAGTCCTTAA AGTTATACAA ACTTTCTCCA ATTTTTCAAT 91200
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TAAAACCAAA 
CCTTTAAAGT 
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aatctccaat^Wtttcaatt AAAACTAAAC 

TTTAAACTCT ACCTTACCAG ATGAAAGCGC 
GTTTTTACCT TTATGAAACT TTGTACCTCT 
AAACTGACCA CCACTTCTTT TAACTCCAAG 
TGAACTACCA CCACTTTTAC TTGTTGCCAT 
TTAGGATACT CAAAGGACAA ATCATTTATG 
AG ACTTTC TT TGTTCAAATC CTTAAAAAAG 
TTTTTCACAA CAAAAGCGTC ACCCTCAAGA 
AAAGAAAAAG AAGAACAGAC AACGTTAACA 
AAAAGATAAA TAATTACATC GTCTTTTACT 
AAACTATTTC ATCAACCAAA ATATAAGAAT 
TTGATTTTCT TCTTCTGTAT CTGTAAGAAA 
AGGTACATCT AATAAGAGAA TTTACGACAT 
TATTAATAAG CAAAACACTA TTAAATTCCA 
CTATTTTTAA AAATTCACCC TCAATAGCCT 
CATACATATA TTACCTCAAC TAAACATTGT 
AAATTAATTA TACCTTACAA GGGAATTGAC 
AATAGCGCAA AGCTCAATAA AACAAAAAAT 
CAAAATTGCA GACGACTTTA AAGTTCCACC 
ATTGGAATAC GTCCTAACAT CGTCTTTGTG 
CTCATACTCT TCACTAAAAA CCTCTCTGGG 
GGGTAGCTGC ATTTTTAAAG ACAAAGGAGC 
TGCAATGCAA TCGATCTTTT TAAAATTATA 
ATAAACTTCG GGTTTTAGCA AAACGCTAGT 
AAAATTGGGT ATTTTTGAAA TAAACTGATC 
CTTATCCCAT CTATATGTTT ATATATTTTT 
TTTTCACAGG GGGCGCTGGA ACAAGATTAT 
TAGCTCTTGT CCAAATTTTA CTTCTACCAA 
TTCCGGTTTT ACTATCAACA CGCATCTCAG 
TTATTTTGCC CCTATCCCAC TTTTTAGAAG 
AATACTTACG^TOTTTTCGCC 91260
AAATATTGTA TAATCTCTTC 91320
TTGTCTAACA ATTATCTCTC 91380
TCGCTTGGAT ATAGAATCTC 91440
TAATTTTCCT CCAAAACTAA 91500
CCTCTAATTA AAAACCTACT 91560
GGCTTAAATT CTAAATAACC 91620
TCAAGAACAC TAAAAAAGGT 91680
TTATTCTTAC CTATAGCATG 91740
TTTACCAAAA CATTAATCAA 91800
AGGTTTGCCT GTGCCCAACT 91860
CAACCTTTTT ATCTTTTTTA 91920
AAGGCTTTCC TATTTTAACC 91980
ACTTATCTTT TTCAACAGGA 92040
TATATTGCTT GCCATTTATT 92100
AAATTTAATA AAAAAGAAAA 92160
TTCATAACTT TCAAGACTTT 92220
ATCCTTAACC TTGCCCCCGG 92280
GGTAGCTAAT ATGTCATCTA 92340
CACCTCTATT CTCCCAAAAC 92400
CAATTTACCC TCTTTTCGAA 92460
ACCTATTAAA TATCCCCTAG 92520
AAAAGAATAT ACTTCATTTA 92580
AATATCATAA AAAAGAACAC 92640
ATAATACTCT GTCTTATTTT 92700
CAAATATCAA GACCAAAAGC 92760
GCAAATCTAA TTTTGGTATT 92820
AAATCCAAAC TTTCCCCTTG 92880
AATTATAAAT TTTACCGTTT 92940
AAGAAGAATA CTTAAGACCC 93000
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CACATAAAAT CAAGACCCTC TATTGCAAGA TTTTCAAACC CAACTACAGT ATCTCCTGAA 
GGATTTTTAG CATCATACTT TTTGCCATCT TTTATTATAG TTAAAATTCG GCCATAAACT 
TCCCCATTAT ATTTATAAAT ATAGATAATA GAATTCTTTA TGTTACTTAC ATCATTATAA 
CCAACCCAAT ATCCTAAAAC TTCATTTTCA AAAACAGGGT TTTCATCCTT GCTAACAATG 
TCCTTTTCAT TTGAATCTTC TGAATTTGCA AATAAAAGCA TTGAAAAACA AAAAAAAAGA 
AAAAACTTTG AAAAAACTCT AGTCATCAAT CCTCCTTAAA ACCAAATTAT AGCTCTTTTT 
TTAAATTACT CATGTAAGGC AGCTAATTTA AAAATTAGCC ACAAACATCA TTATACAACT 
TATTTTATAA TTTTAATTAT TAAGATAAAA ACCTAGACAA AAAAATATAA AATTAAGCAA 
AAACACAACA GGCCAATTAA TTGTTATATG GGACAATTAA TGCTAATATT AATAAAGTTA 
ATTGTCTTTA AGGATTTAAC CGTGGTAAGA GGAATTTATA CAGCTGCCAG CGGAATGATG 
GCAGAAAGGC GCAAGCTTGA TACCGTGTCA AATAATTTGG CAAACATAGA TCTTATTGGA 
TACAAAAAAG ATTTGTCTAT TCAAAAAGCA TTTCCAGAAA TGCTAATAAG AAGACTAAAT 
GATGATGGTC TTTATAAATT TCCCAAAGGA CATCTTGAAA CAGCTCCGGT TGTGGGCAAA 
ATAGGAACAG GGGTTGAAGA AAATGAGATA TACACAGTAT TTGAACAGGG CCCATTAAAA 
ACTACTGGCA ATCCATTAGA TTTAGCACTC ACCGATCAAG GATTTTTCGT AATACAAACT 
TCAGATGGAG AAAGATATAC AAGAAACGGT TCTTTTACTA TTGGAAAAGA AGGAATCCTT 
GTTACAAAAA GCGGATTTCC CGTTCTAGGA GAAAAAGGAT ACATATATCT TAAGAAAAAT 
AATTTTAAAA TAACACCTCA AGGACAAGTC TTTCACAATT CAAACTTTGA ATCAGACCCC 
AAAAGACTTG TTAGCGAGTA TGAAAATTCT TGGGAAAATT ATGAGCTGCT TGATACCATT 
AGAATTGTAA ATTTTGAAAA TCCCAGATTT CTCAAAAAAC AGGGAAATTC TTTATGGATC 
GATACAAAAA CATCTGGCAA AGCACAAGAA ATTGATATAT CATTAAGGCC TAAAATAGAA
ACAGAAACAC TTGAGGCTTC CAATGTTAAT GCTGTTAAAG AAATGGTTTT AATGATTGAA 
ATTAACAGAG CTTATGAAGC TAATCAAAAA ACAATACAGA CTGAAGATAG TCTATTGGGA 
AAATTAATAA ATGAAATTGG AAAATATTAA GGAGCATGTT TTATGATGAG AGCATTATGG 
ACAGCAGCAA GTGGAATGAC TGCACAACAA TACAATGTAG ATACAATTGC CAATAACCTT 
TCAAATGTAA ATACTACAGG ATTTAAAAAA ATAAGAGCAG AATTTGAGGA TCTAATTTAT 
CAAACCCATA ACAGAGCAGG AACCCCTGCA ACTGAAAATA CTTTAAGACC ACTTGGAAAT 
CAAGTTGGTC ACGGAACAAA AATTGCTGCC ACCCAGAGAA TATTTGAACA AGGAAAAATG
CAATCCACAA ATTTACTCAC TGACGTTGCC ATTGAAGGAG ATGGATTTTA CAAAATTCTT 
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CTACCTGATG GAACTTATGC^IByrACTAGA GATGGGTCAT 
GAGCTTGTAA CAAGCCAAGG ATACAAAGTA TTGCCTAATA 
ATCCAAAACT CAATTACAAT ATCTGAAGAG GGAATAGTAT 
AACGAACCAA TAGAGCTTGG GCAAATTGAA ATATCAAGAT 
AGTGCCATTG GAAGCAATTT ATTTAAAGAA ACAGCTGGAT 
ATACCAGGAA GTGAAGGCAT GGGAAGACTA AGGCAAGGCA 
TCTATTGCTG AAGAAATGGT AACAATGATA GTAGCTCAAA 
AAAGCTATTC AAACTTCTGA CAATATGTTA GGAATTGCAA 
AATAAAAAAA AGATTATTTA TTTTTATTTT ATTTTTCACA 
TTCTCATGAT TTATGTTTCA ACATTGCGCC TAGTAAAACA 
TTCAAAAATA TGTAACAATC AAAGCTTATC AAAAATATAT 
AAAATCAATA ATTTTTGAAA TGATTTATTA CATTACAAAA
CTATATACTT CAATTTAACT TTGATGAATC TGAAATAAAC 
AAAAGTAAAA TTTAAGGTAA AAAGCAACAA TTCATACAAA 
TCTTGTTTAT TATGCAAAAA ACTTTGAAAG CTACAAAAGA 
CATTGATGTA ATCGAGCCAA TTGTATTTGC AAAAGAAAAT 
TAATGAGTAC AATACATACT TTAAATACAA AATTAACACA 
AAGTCTAAAT GAATTAAACA ATAGCAAATA CAAAGTTATA 
AGAGATAAGA TTAAATAAGG TGCAAAAAGA ATAATACCTA 
AAATTATTAT TTTAATCTCC CTTAATGCAG CTAATATTTA 
TAATTTAACG AAAAAAGTTT CATTAATTGC AATAATTGAT 
GAAATACAAT AAATAAGGTA AAGAATGAAC AAACTAATGT 
ACGAGTCTAT TAGCCCAAAC AAACAAAGCT TCAACAGGAC 
AACAATAGCC TATCTGAAAG CGTAAAATTA AAAGAAATTG 
ACAAATTTTT TAACAGGTAT TGGAATAGTA GCGGGACTTG 
AAACAAAAAG ACCTTATAAT TAAAATTTTA GAAGAAAACA 
TCTAATAACA TAGAAAGTAA AAATATTGCA CTAGTAAATG 
AATACAATCA AAGGTTCAAA ACATAAAGC^.: TGCGTTGCAT 
TTAACAAATG GAATACTTTT AAAAACAAAT CTTAAAAATA
ATTGCATCAG GAATTACACA GCCCAATAAT AAATTAAAAG 






TTAAAATCGA^WfcTAATCGA 

TACTCTTCCC AGAAGAATAT 
CGGTAAAAAT TGATACCAGC 
TTATCAATCC TGCAGGACTA 
CAGGCCAAGA AATAGCAGGA 
TACTTGAAAT GTCAAATGTA 
GGGCTTATGA AATAAACTCA 
ATAACTTAAA AAGGCAATAA 
ACAAGCTCAA TTATAAGAGC 
TATTTCTTTT CAAAGAAGTA 
ATCCCCCCAC ATTTAACAAA 
AATTTATCAA ATGAAAATAT 
ATAGAAGATA AATTTTTCAA 
AATATTCCAA TTGAAAAAAC 
CACAATTACA TCAATATGTA 
CTAAAAAAAA ATGAAATCCT 
ACAAGAATAA ATGATGTTTT 
CGCAACACAA TCAAAAATGA 
ATTTTATCTT CCTTTTCTAA 
ACAAATCAAG GATTAATTAG 
ATAAAATAAT AGATATTAAA 
TGATGTTAAT TACATTTGCA 
TAAAAACAGA TCAATCATTT 
CGGATATTTA TCCCACAAAT 
CTGGAAAAGG AGACTCTATA 
ATATAATAAA TGAAATAGGC 
TCAGTCTCCA AGTAAAAGGT 
CAATACTGGA CTCAAAAGAT 
AAGAGGGGGA AATAATAGCA 
GATCTGGATA TACTATAGAT 
AGTGTAATAA TAAATGAGAA TCAAAATATT AACCACAGTT ATAATATAAT TCTTAAAAAA 96600
GGAAATTATA CATTAATAAA TAGAATTCAT AAAATATTAA CCTCTAAAAA AATCAACAAC 96660
AAAATTAAAT CAGACAGCAC AATAGAAATA GAAGCAAAAA ACATAAGCCT ATTAGAAGAG 96720
ATTGAAAATA TTAAAATAGA AACCAACCCC AAGATATTAA TAGACAAAAA AAATGGTATT 96780
ATTTTAGCAA GTGAAAATGC AAAAATAGGA ACTTTTACAT TTTCCATTGA AAAAGACAAT 96840
CAAAACATTT TTTTAAGTAA AAATAACAAA ACAACAATTC AAGTAAACTC AATGAAATTA 96900
AATGAATTTA TATTAAAAAA TTCCAACAAT CTTAGCAATA AAGAATTAAT TCAAATAATT 96960
CAAGCTGCGC AAAAAATTAA TAAATTAAAT GGGGAACTTA TCTTGGAGGA AATTGATGGA 97020
AACCAAAATT AATTCACAAA ATCTAAAATT TAAAAATCAA ATAAATAATT TTAAAAATTC 97080
TGTAGAAATA AAAAAATCCT TTCAAAAAAA CGAAGATCTT CGAAAAGCTT CTTTAGAATT 97140
TGAAGCTATG TTTATCAAGC AAATGCTTGA AAGCATGAAA AAAACTCTTA ACAAAGATCA 97200
AAATTTGCTA AACGGAGGCC AAGTAGAAGA AATTTTTGAA GATATGCTTT GCGAACAAAG 97260
AGCAAAACAA ATGGCACAAG CTCAAAGCTT TGGCCTTGCC GATTTAATTT ACAATCAATT 97320
ACAAAAAAGT AAATAATTCA AAAAATACTC CCCCTAAACT CAAAATTATA TCCTATTTAG 97380
TTTAAAACCA TTTTTAAATT AAATTGGCAC AGTTTTTGCA TGGAAATTAA GTAGTAAAAA 97440
CTTAATCACA ATATTCAAGA AAGGGGAGAA AATATAATAA CTATGAACAT ATTTAGTAAT 97500
GAGGATTTAA ACATATATTT AAAATCAGTA AGAGAACACA AGCTAATTAC TCACGAAGAA 97560
GAAATCAAAC TTGCAGGACA AATACAAAGA GGCAATGCAA AAGCAAAAAA CAAGATGATA 97620
AATGCAAACT TGCGACTTGT TTTAAAAATA ATAAAAAGAT ATGCGGGTAA AGGGTTAAAA 97680
ATTGAAGACT TAATTCAAGA AGGCAACTTG GGATTAATAA GAGCTGCTGA AAAATATGAC 97740
CCGAATAAAA ATACCAAATT TTCAACTTAT GCATCATTTT GGATTAAGCA ATCACTACAA 97800
AGAGCATTAA ACACTAAAAC CAGATTGGTA AAAGTCCCAT ACAGAAAAGA AAATCTAATA 97860
CTACAAATAA ATAAATATTT AACAGAAGAA GAAAAATCGC CCAAAAAAGA AGAAATAATG 97920
AAAAGATTCA ACCTATCTCC TGCTCAGTAT ATAAAAATTA TTCCCTATCT TGAAAAAGAA 97980
TATTCTCTGG ACAAAGAAAT AGAGGGATCT GAAAATTCAA CACTCTTGAA TCTATACGAG 98040
GATAATTCTT TTAACCCTGA AATTACCCTT GAACAAGATT CAACTCTAAA ACATTTGAAT 98100
TATATACTTG AAACAAAATT AAATGAAAAG GAAAGATACA TAATTAAAAA AAGATATAAC 98160
CTGGACAATA GTCCCAAAAA AAGCACCTTA AAAGATATTT CAACAGAACT TGGAATATCA 98220
TCAGAAACTG TAAGACAGAT TGAAAAAAGA GTTCTTAAAA AATTAAAAGA AGAAATAAAT 98280







TAACATTGAC ATTCATGACA"^WTTCTGGTC TACTTGTAAG TCAGTGGTCtf^GAATGTTTG 98340
TATTTTATAT TAAAAAAAAC AAGTTTATTA TTGTAATTTT TATTATAATT TCTATTGTTA 98400
TTGCAATAAC TCAGGCATTT GCAAGTTTTT TATATTTTAA TGACAATTCA AAAATTGCAA 98460
ATGCCCCACT TAAAAATAGG TTTGAAAAAA CACAAAAAGA AAGCTTAATA ATAAAAAACA 98520
ACAACGAGGA TAAAAAAGCC AAAAGCAAAC CTAAGTTTTA CTTAATCATT GACGACGTGG 98580
GCTATGATGA ATTTATGTTA GAACAATTTA TAAAACTTAA TCTTAAAATA ACTTATGCTA 98640
TTATTCCATT TTTACCAAAA TCAATGAGTT TATACAAAAA ACTAAAAAAT GCTAACAAAA 98700
CAGTAATAAT ACATTTCCCA ATGCAATCAA AACATAGAAA TTCAATAGAA AAATTTCATA 98760
TAAACATAAA AGATAAAAAA GAAGAAATAC ACAAAAAAAT CGAAAAAGCA TTTAAAAAGT 98820
ATCCTGATGC AAAAATAATG AATAACCATA TGGGAAGTTT AATCACTTCA AATAAAGATT 98880
TGATGAAAAT CATTTTAGAA AAGCTTAAAG AGATTGACAG ATATTTTTTC GACAGCGTAA 98940
CTATTGCAGG AAGCGTACCA GAAATAATAG GCAAAGAAAT TGGAGTTAAA GTAGAAAAAA 99000
GAGACGTATT TCTTGATAGC AAAGACACAG AAGAGTCCGT AACAAAGGAG CTTGAAAAAG 99060
CAAAAAATAT TGCTAGAAAA AATGGAATGG TAAAAGTAAT AGGACACATT TGGTCTAAAA 99120
ATACGCTAAA AGTCCTTAAA AAAGAAGGAC CTGATTTAAA CCAGGAATTC GAATTCGACA 99180
ACTTATTAAA TCTTTACGAG GAAACAATCA GATGAAAGTG CTTGGAATAG AAACCTCTTG 99240
TGACGACTGT TGCGTAGCTG TAGTAGAAAA TGGAATTCAT ATTTTAAGCA ATATAAAATT 99300
AAATCAAACC GAACACAAAA AATATTACGG CATAGTGCCT GAGATTGCCT CAAGACTTCA 99360
TACGGAAGCT ATTATGTCTG TTTGTATAAA AGCACTAAAA AAGGCAAATA CTAAAATATC 99420
TGAAATTGAC TTAATAGCTG TAACATCTAG ACCTGGACTT ATTGGATCTT TAATAGTTGG 99480
ATTAAACTTT GCCAAAGGTC TAGCAATTTC ATTAAAAAAG CCCATTATTT GCATTGATCA 99540
CATCTTGGGT CATCTTTACG CCCCTTTAAT GCACTCAAAA ATAGAATATC CATTTATATC 99600
ATTATTATTA AGTGGTGGAC ATACATTGAT TGCTAAACAA AAAAATTTCG ATGATGTTGA 99660
AATACTTGGA AGAACTCTAG ATGATGCTTG TGGAGAGGCT TTTGATAAAG TGGCAAAACA 99720
TTATGATATG GGATTTCCGG GAGGTCCAAA CATCGAACAA ATATCTAAAA ATGGAGATGA 99780
AAATACATTT CAATTTCCAG TTACCACCTT TAAAAAAAAA GAAAACTGGT ATGATTTTTC 99840
ATACTCTGGA CTAAAAACAG CTTGCATACA CCAACTCGAA AAATTCAAAA GCAAAGATAA 99900
CCCAACAA^A AAAAATAATA TAGCTGCAAG CTTCCAAAAA GCTGCCTTTG AAAATCTAAT 99960
CACCCCACTA AAAAGGGCAA TAAAAGATAC TCAAATCAAC AAATTGGTAA TAGCAGGAGG 100020
TGTTGCAAGC AATTTATATT TAAGAGAAAA AATAGATAAG CTTAAAATAC AAACTTACTA 100080
CCCTCCTCTT GACCTTTGCA CAGACAATGG AGCAATGATT GCGGGACTTG GATTTAATAT 100140
GTATTTAAAA TATGGAGAAA GTCCAATTGA AATTGATGCA AATTCAAGAA TAGAAAATTA 100200
TAAAAACCAG TATAGGGGGA AAAATAATGA AAAGAATTTT AGCAATGCAT GATATTTCAA 100260
GCATGGGAAG AACATCTCTT ACAATATGCA TACCAGTAAT ATCTTCGTTT AATATGCAAG 100320
TTTGTCCTTT TGTGACAGCT GTCCTTTCTG CTTCCACAGC TTATAAAAAA TTTGAAATAG 100380
TGGATTTAAC CGATCATTTA GAAAAATTTA TCAATATATG GAAAGAACAA AATGAGCACT 100440
TTGACATACT CTATACCGGA TTTCTGGGAA GCGAAAAACA ACAAATAACA ATAGAGAAAA 100500
TAATTAAATT AATAAAATTT GAAAAAATTG TAATTGATCC TGTGTTTGCT GACGATGGAG 100560
AAATTTACCC TATATTTGAT AATAAAATAA TTAGTGGATT TAGAAAAATC ATAAAGTACG 100620
CAAACATAAT AACACCCAAT ATCACAGAAC TTGAAATGCT AAGCAAAAGC TCAAAACTTA 100680
ACAACAAAGA TGATATCATA AAAGCAATAT TAAATCTTGA TACAAAAGCG ACGGTAGTTG 100740
TTACAAGCGT TAAAAGGGGA AATCTCTTGG GAAACATTTG CTACAATCCT AAAAACAAAG 100800
AATACTCGGA GTTTTTTTTA GAAGGATTAG AACAAAATTT CAGTGGAACA GGAGATTTAT 100860
TTACCAGCTT ACTTATAGGA TATTTGGAAA AATTTGAAAC AGAGCAAGCC TTAGAAAAAA 100920
CAACAAAGGC TATTCACCTA ATAATAAAAG AGTCAATTAA AGAAAATGTT TCAAAAAAAG 100980
AAGGGGTCCG AATTGAAAAT TTCTTAAAAA ATACATTTTG AATTTAAATT CCATTAAATT 101040
CAATTTTTAA GATTGAATCA ATTTCTTGGT ACAAAGGAAA TACTGATATT GCAATATATT 101100
ATTAAAATAA AATGTGAAAA AATTTATTAC AAAGTAAATG CTTTATTGTT TTCATGAGTA 101160
AATAAAAATA TGTCAAATAA AAAAATAATA TTTTTTACAG GGGGAGGAAC TGGGGGTCAC 101220
GTATTTCCAG GAATTTCCAT CATACAAAAA TTAAAAGAAT TTGATAATGA AATTGAATTT 101280
TTTTGGATAG GTAAAAAAAA TTCTATAGAA GAAAAACTAA TAAAAGAACA AGATAATATT 101340
AAATTTATTT CGATTCCATG CGGAAAACTT AGACGCTATT TTTCTTTTAA AAATTTTACT 101400
GACTTTTTCA AAGTAATACT TGGAATAATA AAAAGCTTTT ACGTTTTAAA AAAATATAAA 101460
CCTCAGCTTA TTTACGCAAC CGGAGGATTT GTTTCAACTC CTGCAATTAT TGCATCCAGC 101520
TTGCTAAAAA TAAAAAGCAT AACCCATGAA ATGGATCTAG ATCCCGGACT TGCAACAAAA 101580
ATTAACTCTA AATTCGCAAA TAACATACAC ATAAGCTTTA AAGAAAGTGA AAAATACTTT 101640
AAAAATTACA AAAACATTAT TTACACAGGA TCTCCTATAA GAAGAGAATT TTTAAATCCA 101700
GATCCCAAAA TAATCAAACA ATTGACACAA AACACTAACA AACCAATTAT TAGCATACTT 101760
GGGGGATCTC TTGGCGCTAA TGCTTTAAAC AACCTTGCAC TCTGCATTAA AAAGGATGCT 101820
GAAATCTACT TCATCCATC^WTCGGGGAAA AATTTAAATG ACCTAAGCgJPEaAGAATTAC 101880
CTTAGAAGGC AATTTTTTAA CGCAGAAGAA ATGGCAAGTA TAGTTAAATT TTCTAATCTA 101940 
ATAATAAGCA GAGCCGGAGC TGGAGCAATA AAGGAATTTG CAAATGCTGG TGCATGTGCA
ATTTTGATTC CATTTAAAAA AGGCTCTAGA GGAGATCAAA TTAAAAATGC AAAATTACTA 
ACAAATCAAA ATGCCTGCAT TTATATAGAT GAAGATGAAA TTTTAAATAT AAATATTTTA 
AAAATTATAA AAAAAACTTT AAAAGATAGA GAAAAAATCA ACTCTCTCAA AGAAAATATC 
AAAAAATTCA ATAATAAGCA TTCTTCAACT TTAATAGCCA AATTGCTAAT AAAAGATATT 
AAGGAGACAA AATCTAAATG ATAATAAACG ATCCTGTAAA AATAACTGGA ATAGTAGACA
TATTAATAAT AATAATTTTT ACATCTTTGG GATTTAGAGG ATTTTTAAGA GGATTTATTA 
AAGAAATTAG CGGATTTGCT GAAGTTTTTG TTTTAATCCT ACTGCTTTAC AAAAAAACTG 
AAGAATTTAG AAGGTTTGTT GAACCTATTA TTGAGCTATC CTACATTCAA GCACTACTTG 
TATTTTTTTT GCTTATACAT ATAGGATTTT TAATACTACA ATCCCTAATA GAATCAATAA 
TAAGTCAACT TAAATTGCTA TTCTTCAATA GAATACTAGG CTTAGTGCTT GGCCTACTTG 
AAGCTTTTGG AATAATTGCA ATCGTGGTTT ACATAATACA CTCACAACAA ATATTTAAAC 
CTGAATATTT CCTAAAAGAA AGCAAACTAC TTGATTATTT AAATCCTGGA ATAAACTATC 
TCTTTAAAAT TTCAAAAACA AAATAAGGGC CAGCAATGAC AATGCTTCCA AAAATTGCAA 
AAGAGATAAT AAACGAATAT GATCAAAAAA TACTGCCAAA TGCAATTCTT TTACTAGGAG
AAAAATTTTC TTCAAAAAAG ATTAGCGCAA TTGAGCTTGC AAAAAAAATA TTAAACGGAA 
AAAACTTAAC AAACCCTAAT TTGCTCATTT TCTCAAATCT TGACACAGTA GAAGCAAAAG 
CACATCTTTC TACAAATTCG CAAAAGATAG CAAATAAATA CCTAGAATAT ATTAAAACTG 
TAATTTTTAC CAAATGTTAT TTCAGCAATG AAAAAAATTT AAAAAAAATA GAAAAAAATA 
TCAACTACAT TAATTCTGTT TATTATGAAA AAGAATACAA TGAAAACATA AAAAATGAGC 
TTATAAAAAA TATAGAAAAT ATAAACAAAG AATTAAATCA TAGCATTACT GTTTATGATG 
TAAAAAAAAT TCAAACTTGG ATTTTTTCTG AAAAAGAAAA ACCAAAGGTA ATCTACATAA 
ACGAAATCGA AAATTTATCA TTTAATGTCC ATAACTCACT TTTAAAAATA TTGGAAGAGC 
CTCCCTCAAA TATTTACTTT ATCTTGGCAG CAAGAAATAA AAACAAAATA CCAAAAACAA 
TACTTTCAAG ACTTAGAGTC TACAATTTCG CAAAACTAGA CAGAAGCTTA GAAATTCAAA 
GATTTAAAGA AAGCTTTCTA ATAAATAAAG ATATAACAAT TGAAGAGTAT TTCGCCTCAT 
TTTACAAAGA AGAAAGCAAA AAAATAAAAA AAGAATTGGC AAAAATTCTA AATATAATAA 
AAGAAAAAAA ATCCATATTT AATCTTGAAG AAGTCGACTT TATAAAAGAT GAGCAAAGCT 
TTAAAATATT TTTAAACGAA CTTACAATTA ACATTAGAAA AGATTTTTTA GAAAACAAAA
TAGATATTAA TCAATATCTA AAGTACACAG AGCATTTGAA AAATATTTAC AAATATCGCC 
CCTATAATCA AAATAAAAAA TTAATAATAG AAAACTTAAT GCTAAATTAT GAGGAGATAT 
GAATAATTTT TTCAAAAAAG CTTTAACAAA GCTAAACAAA TTATCTAACG AACAAAAAAC 
TAAATTTATT GAACAAATTT ACAAAAAAAT AGAAATATAT GACGGAATAT TTGCATCAAT 
TAATGAAGGA ATCATTGTAC TTGACAAACA AAACAATATA ATCTATGCAA ACAAGATTTT 
ATACCAAATT TTAGCTTTAA CATCTAAATC AAAAATAGAA ATTCTTGATG ACATTCAAAT 
TCCAAACTTA ATAAATTTAA TAAAAGAACT AGTTAGAACA GAAGATAAAA TAATAGGATT 
AGAAGTTCCA ATCTCAAACG GCATATATAT TAAAATCTCA TTTATGCCTT ATGTAAAAGA 
AAAAAAACTT GAAGGCAACA TTATTTTAAT CGAAGACATT AAAGAGAAAA AAAAGAAAGA 
GGAACTATTT AGAAGAGTTG AGGCTTTGGC CTCTTTTACA AGGCATGCAA GAAATATTGC 
CCATGAAATC AAAAACCCAC TTGGAGCAAT CGATATAAAT TTACAACTGC TAAAAAAGGA
AATTGAAAAA CAAAAAATGA AAAATGGTAA AGCTGAAAAT TATTTTAAAG TAATAAAAGA 
AGAAATAAAC AGAGTAGATA AAATAGTAAC AGAATTTTTA CTAACTGTCA GACCAATAAA 
AATTAACTTA CAAGAAAAAG ATATTAAACA AGTAATAGGC AGCGTATGTG AATTGTTAAA 
TCCTGGATTA GAAAATAAAC ACATAAAACT ATTGCTTAAT TTAAACAAAA TAAGCAATAT 
TCTCATTGAT GAAAAACTAT TAAAACAAGT TATTATAAAC ATCGTTAAAA ACGCAGAAGA 
AGCACTGCTT GAAACAAAAA AAGAAATAAA AAAAATAGAA ATTTTTCTCT TCGAAAAAGA 
CAATAAAATA CATATCAACA TAAAAGATAA CGGAAACGGA ATAAAAGATG GGGTAAAAGA 
GGAAATATTT AAGCCTCAAT TTAGCACAAA AGAAAAAGGA AGTGGAATAG GACTTACTAT 
TTCTTATAAA ATAATAAAAG AGCTTGGAGG TGAAATTTTT GTGGAAAGCA AAGAGGGCAA 
AGGCACTATT TTTACAATTA CGCTGCCTAA ACTAAATAAA AAAAATATTT TAATTGAAGG 
GTATTGAAAA TGAGCAAAAT ACTTGTAGCT GATGATGAAA AGAATATTAG AGAAGGAATT 
GCTACf TATC TTGAGGATGA AGGATATTTT GTTTTCACTG CTAGTGACGG AGAAGAAGCT 
CTTGAAACAA TTGAAAATGA AAATCTTGAT GTAATAATAT CTGACCTGAG AATGCCCCAG 
ATATCTGGAG AAAAATTGCT CAAAATAGTT AAAGAAAAAA ACTTGGGAAT ACCTTTTATT 
ATTCTAACAG CCCACGGAAC AGTTGATTCT GCTGTAGATG CCATGAGAGA GGGTGCTTAT 
GATTTTTTAA CAAAGCCCTT AGACCTTGAA AGACTTTTGC TAATAATAAA AAGATCACTA 
AATAAAAAAG AAAATAACGA TAATGAAAAT GCTAATTTAG AAAATATACT AATAAGAAAA 
GATCTAAAAT ACTATGAAAA^KtCATGGGA AAATCCCTAT TAATGCAAaJ^KtTTTTGAA 105420

CTTGTAATAA AAATAGCAAA ATCAAATGCA TCTATTCTTA TAACGGGCGA AAGCGGTGTT 105480 

GGTAAAGAAA TAATAGCAGA TGCTATTTTT GATCTTTCAA ATAGAAATGA CAAACCATTT 105540 

ATAAAAGTAA ATTGCGCAGC ACTTTCTGAA AGCATTCTTG AAAGTGAACT TTTTGGCCAT 105600 

GAAAAAGGAG CATTCACTGG AGCAATTTCC AAAAAAAAAG GCAGATTTGA ACTTGCAAAC 105660 

AAAGGCACAA TTTTTCTTGA TGAGATAGCA GAAATTTCAC CTGAAATTCA AGTCAAGCTT 105720 

TTAAGAGTAC TGCAAAACAA AACTTTTGAA CGTGTTGGGG GAGAAGCTAC AATTAAAGTT 105780 

GATATCAGGC TTCTGGCTGC AACAAACAAA AACATTGAAG AGGAAATTAA AAAGGAAAAA 105840 

TTTAGAGAAG ATTTATTTTA TAGATTAAAT ATCATTAATA TAAACATACC GCCTTTAAGA 105900 

GAAAGAAAAG ATGATATATC TTATTTAACA AACATACTAA TAAAAGACGT CGCAAAGGAA 105960 

AACAATAGAG AAGAAAAAAC TCTTTCTAAT GATGCAATGA AAGCTCTCTA TTATTACGAT 106020

TGGCCAGGAA ATATTAGAGA ATTAAAAAAT GTGCTTGAAA GTGCATTAAT ATTATCAAAA 106080 

GGCAAACAAA TCACTAAAGA AGATTTGCCA GCAAAAATCA AAAATAATGA AAATCTTATA 106140 

TTTAAAATAA CACTACCAAT AGGAATTAGC CTAAAAGAAG CTGAAAAAGA AATAATAAAA 106200 

CAAACACTTT TTGATTCCAA AAACAACAAA AGCAAATGCG CCGAAATACT AAAAATAGGA 106260

AGAAAAACTT TACACAATAA AATAATCGAA TATAATATTG ATTAATAGGA TTTATTTTAA 106320 

ATTATTAAAT TATAATGGGT ACAAAAAAAT AATACTGCTT TAAATTCCAT GTATATTTTT 106380 

GAAACCAAAA AATTTTTTAA TGCCAATAAT TATATTAAAA TGAAACACTT TCTTTTAAAA 106440 

TCATGGCGCA AAAGTGTAAA AATATTTTTA TCAAACAAAT AATTATACAC CATTATTTGT 106500

TAATAATCAA TACAATTTGA TAATTTAATA TATTTAGCTG GCTACAGAGC CTGACCTTAC 106560

TTTAAAAACT TTAAAGGGTT AATAGGAATA TTTTTTTTTA ATATTTCAAA GTGCAAATGA 106620

GGACCAGTTG CGCGACCCGT TTGCCCAACC nTTCCAAGAA ATTCTCCCGA TTTAACAAAA 106680

TCACCTATCT TTACAGAATA TAAATTTAAA TGCCCATAAA GAGATTTAAT ATTATTTTTG 106740 

TGACCAACCA CAACAAAATT CCCATAAAGA TCATTGTATC CAGCTTCAAT AACTATTCCA 106800 

GAAGAAGAAG ATACACTTCA GCATTCATTG GAGCTGCAAG ATCTATTCCT GTATGGAAAC 106860 

TTTTGTTGCC AGTGAAAGGG TCATTTCTAA ATCCAAAATC AGAACTAACA ATAAATTTTT 106920

TTAAAGGAAA AATAAAATTG GCATTTAAGA AAAAAAGCAA TTCTGTGCCT GAAAAAAGTC 106980 

CAAAATCTGG ATTCTTAACA AAATCAAAAA AATAAAATTC ATAAACTCTG TCGTTCCTTT 107040

TAATTTTTAC CTTTTCAGCT TTAGCAAGAT CCCTTGTTGC TAAAAGCAAA TTATTAAATC 107100 

TATAATCTTT ACTATCAAAA ACAAAAACTC CTTTTTTACT GGGAATAAGA ATCTCTTGCC 107160 
CAACACTCAC AGCAGGAGAA TCTAATAAAT TAATAGTAGC AATGCCGGAC TGCCATCCAT
TTATTTTATT GGCAATTTTA AAAAAAGTAT CCCCTTTTTT AACTTTATAT GAGTAAAAAA 
ACAGAGGAAT ATGTTGTTTT TTGTTATATT TTAAAACTTT AATTTTAAGA TCAGAAAAAA 
CAGGATCTTG CCTTGAGAAA TTTTTTATTT CTGGATAAGA AAAAACATAA ATTATTTTTA 
AAAAAAAGAA ACCTGCATTA AATAATAAAA AAATTTTACT CATACTATAA ATTCTTTAAC 
GATTAATTAA TCAAATATAA TAAAAAACAT AAAAAAAATA AATCCTATTT GGACTTGCAA 
ATAATAACAA ACTGTAATAA ACTGTCTCTC AACATGGAGC TAAACGAATA CCAAGAAAAA 
GCAAAAAAAA CTGCTAAATA CAAAAATAAA AAAGAAGAAT TAATTTTAAC AACACTTGGT 
CTTGCTGGTG AAACTGGAGA AGTTGTTGAA AAAATAAAAA AATTGGGAAG AGATAAAAAT 
TACATTATTG ATGATGAGTA TTTAATATCA ATTAAAAAAG AGCTTGGGGA CGTATTATGG 
TACTTGTCAA GTTTAAGCAA TAATTTAGGC ATTACGCTTG AAGATGTTGC CCTCACAAAC 
CTAAAAAAAA TACAAAAACG ACATGAAAAT GGAACAATAA ATGGCGAAGG CGATGACAGA 
TAAGGCATTT AAATTTAAAA TACTCAAAGT TTAAAAATAA AAATGCTTAA TATTTATATC 
AAGGGAATTT TACTTGGAAT TGCAAACATA ATCCCAGGGG TTTCTGGGGG AACGCTGGCT 
TTAATATTAA AAATTTATTA CAAAATAATA AACTCCATCT CAGAAATCTT AAAGCTCACA 
GAAATTAAAA AAAATTTAAT GTTTTTAACT ATTTTGGCAA CAGGAATGTT AACCTCAATA 
TTATTAACTG CAAAAATATT TAAAACTTAT GCTTTTGACA ATGGAATAAT AGAAGCACTG
CTAATAGTAT TTTTCATAGG ATTAGCATTT GGAAATATAC TAACACTAAA AACAGAAATA 
TCTATAAAAG AAATAAATAG TAATACAAAA ATATTAAATA ATTTATTGTT TTTCATTGGT 
ATGAGCATTA TTGTACTCTT CTTAATACTC AAAGAATCTA ATATACAATT GCAAAGTACA 
ATACCTAAAG ACAAAAACTC AATAAAATAT TACTTATTAT TGATATCCTC TGGAACAATA 
AGCGGAGCAT CAATGATCTT ACCGGGAATC TCAGGATCTG CAATGCTTTT ACTGCTTGGC 
TTTTATAAAG AAATAATACT TATTGTGTCT GAATTTAACA TTATTCTTAT TACAATATTT 
GGAGCTGCTG CAACAATGGG AATAATTACA TCAATATTAA TAATAAAGAA AATAATAGAT 
AAGCACTTAA ATAATTTTAT TTATTTATCA AAAGGCTTAA TTTTTGGATC AATTCTACAA 
ATGATATTAA TTGTATTAAA ATTGAACTTT AAAATCGGCT TTACATCTTT TACATCTCTG 
GGAACATCAT TCATACTGGG AATCTTTATA AACAAAAAAT TGGCTGAGAA ATATAAATAA 
AAAATTTAAA AATACCGAAG ACCGGACTTG AACCGGTACG AGCTTCCTCC TCAGGATTTT 
AAGTCCTGTG TGTCTACCAA TTCCACCACT TCGGCATAGA ATAATATAAT AAATAATAAT
ATATTGCAAT GTCAAGTTAA^WkAGAAAAT AAATATAAAA ACTCAATTG/^PKaCTATTTT 108960

TTGAGGAAAA TTCTCAACAA CATAATTGCT ATGCAAATAA TCTGCAATTA CAAACATAAA 109020

TGAATATAAA GAAACCCTAG CAAAAAAATT AGAAAAAACT CTAAAATCAA TTTTATTTGC 109080 

TAACATTGAA GTGTAGTTTG AATCATTTAT TAAGCTAAAA AATCCATTTT TGATTAATAA 109140 

ATCAAAAATA ACAAATTCAG ATTCATCTAA AATAAACATC AATTTATCCA TTTGGGCTTT 109200 

TAAATATAAA ATTTGTTCAC TAATTTGCAA ATTGGCAATA CTTTTAATAA AATCCTTGCT 109260

TAAATTATAA ATAACAATAC CATAGTGATT ATTCTCAAAT TTAAAATCCA CAAGCAATCC 109320

TAATTTTAAA TCGCAAGAAT CTTGCATTAC ACTTAAAACT TTAATATTTA TTTTGCGCTT 109380

AGAAAGAATG CTAACAGAAA GGCCACTACC TTGATCAATA AAATAAGCAT TCTTAAAAGC 109440 

TCTAAATAAA ACTCTATTAT TAATTAAATC AAGAACATTT TTATTCTTAA GATTCTTAAT 109500 

AATTATTAAA TCAAAACTTC GATTTAGAAT ATTATAAGCA GAATCAAGTA ACCCTTGAGA 109560 

AAAATCATCA TTTAAATTAA AATTTGCAAC CTTAAACTCC AAAACACTAA AATCTGTATC 109620 

TACAACATTA CAGGAAAAAA ATAAAACACT AAGCGGAAAT AAACTCTTCA AGTTGATACT 109680 

TTGTCTCAAC AACTTCAAAT ACAAGCCCAT ACTTTTTTGC AGCACTAGTA TCCAACCAAA 109740 

AATCTCTATC AGTATCCTTT TCTATTTTAG AAATTTTTTG ACCCGTTTCT TTTGAAATAA 109800 

TATTATTAAG TTCTTTTTTA ACTTTATTTA ACTCATTAGT GTAAATCTCA ATATCTGTAG 109860 

CAACTCCCTT AAATCCACTC AAGGGCTGGT GCAATAAATA TCTGGCAAAG GGCAGTGAAA 109920

ATCTATTTTC TAATTTTGCA GCCAAAAAAA TTAAAGCAGC AGCGCTAGCA ACAAGCCCTA 109980 

CTCCAACTGT AAAAACTTTA GGCTTAACAA AGCGAATCAT ATTAAAAATA GCAAATCCAG 110040 

CATCAATGTC GCCTCCTTCT GAATCAATAT ACACAAATAT AGGCTTTTTA AAATCTAGAG 110100 

CCTCTAGCAA TAATATTTTT TCCTGAAAAA GCCTGGAAAC ATCCTTGGTA ATCTCACCAG 110160 

CAATAACTAT TGATCTGCTC TTTAAAACTA ACTTCAATGA TTTATCATGC AAAACACAAG 110220

CATCATTATC TTCTTTCCCG GTCATAAAAC ATCCCTTATA CAAAAACATA ATGATATATT 110280 

ATAATTGAAA ATAAAAGGTT TTTAAATGAT AAAAAAGCAC AAAAATTAAA CAATTGCACT 110340 

TAATTTCTGA AAAGCAAAAG ACTAATAAAT CTTTAATCAA GCTTCATTAA AGTTAAAAAA 110400 

TACTCTAAAT TTTACAAATT AAGTAAAATT AAAAAGGAGT TTATAATGCA CCATGAATTT 110460 

GCGGTTATCG GAGGGGGAAT AGCGGGAAGC ACCGTTGCTT ACGAACTGCT TAAAAGAAAT 110520 

AAAAAAGTAA TTCTTTTTGA TAATGAAGAT ACAAAAGCAA CAATGGTAGC GGGCGGGCTT 110580 

ATTAATCCTA TTATGGGTAG AAAAATGAAC ATTGCCTGGA AAGAACCACA TATTTTTGAA 110640 



TTTGCAAAAA ACTACTATCA AGAAATTGAA AAAACCATTA AATCCAAATT TTTTATAGAA 110700
AAAAATATCT TTAGACCCTT TACTACTGAA AATCAAAAAA ATGAACTGAT TGATAAACTT
GAAAATAATA AAAACATAAC AAACTTTATT TTAAAAATAC AAGATGGAAA AACTTACAAT 
TTCTCAAACG ACTCTAACGG CGGAATGATA ATAAAAGGCG CCAGGGTTAA TACAAAAACA 
TATATAAAAA ATATTAAAAA ATACTTAATC GAAAAAAATT CTTACATAAG CAAAAATATA 
AACGAAAATA AAATTAAACT TGGAGAAAGT TTTTTCAAAA TAGAAGATTT TAAATTTGAA 
AAATTAATAT TTGCAAAAGG GTATAAAGAA AAACTCAAAG GATTTTTTTC TTATCTCCCA 
TTTGAGCCTG CAAAAGGCGA AATCATTATA TTAGAATGCA AAAAATTAAA CTTTAAAGAG 
ATTTACAATA GACACATATC TTTAATTCAC TTAAAAGGCA ATAAATTTTA CCTTGGAGGC 
ACTTACGAAT GGAACACTTG GAATACACTT ACAAATGAAT GGGCAAAATT AGAGCTATTG 
AAAAAATTTA AAAAAATAAC AAATCTAAAA TGCAAGGTCA TTGCTCAAAA AGCACATATA 
AGGCCTTCAA CTCTTGATAG AGAACCTTTC TTGGGAGAAC ATCCTAAGCA TAAAAATATC 
TTTATATTAA ATGGTTTTGG AACAAGGGGC GTATCTATGG CTCCATACTT ATCTAATTTA 
TTAGTTAATA ATATTGAAAA AATTGACAAA ATTCCAAATC ATTACAATAT TAAAAGATAT 
GCAAAATATT ACAATATTTT GGATCATTCT TAAAATCAAA ATTTTTAAAT CCATACATAC 
TGACAAACGA CTACTATTAA TATTTCTAAA TTCATAAAAA AATAATATAA TGTTTAAGTT 
AAGCTAAAAT AATTCTTATC CAAAGAGAAA CTAAGAGTGA AACAAGATTT AACAAAGCAA 
ATAAAATTAA TTGACACTTA CAAAACAAAC CAGGAGAATA ATCTTTGGGA TTTAATATTA
ATATCATAGG AACTGGAGGA ACAAGGCCAC TCCACAATAG ATATTTGTCA TCCGTACTAA 
TCGAATACGA TGGAGATAAC TTTTTGTTCG ATTGTGGTGA AGGAACCCAA ATGTCTTTAA 
GGAAACAAAA AATATCCTGG CAAAAAATAA AAATGATTTG CATTACACAC TTACATGCTG 
ACCACATCAC GGGACTACTT GGAATAGTAA TGCTAATGTC ACAAAGTGGA GAAACAAGAA
AAGAACCATT AATAATCGCT GGACCTGTTG GAATAAAAAA CTATACACAA GCTAATATAA
ATATGCTTAA AATATATAAA AACTATGAAA TAATTTATAA AGAAATAATC ATAGATAAAA
CCGAAAAAAT AATATATGAA GATAAAACAA AAAAAATTGA ATACACTAAA CTAAAACATT 
CAATAGAATG TGTTGGATAT TTATTTATAG AAAAAGATAA ACCCGGCAAA TTCAACACAG 
AAAAAGCAGA AGAGCTAAAT ATTCCTAAAG GGCCTATTAG AAAAGCCCTA CAAGATGGAA 
AAGAAATATT GGTAAACGGA AAAATTATAA AGCCATCAGA AATACTTGGA AAATCTAAAA 
AAGGACTAAA AGTTGCATAC ATTACAGATA CTGGTTATTT TAAAGAACTC ATACAGCAAA 
TCAAAAATTT TAACCTTGTA ATAATTGAGA GCACATTTAA AAATGAGCTA AAAAAAGAAG 
CCGATAAAAA ACTTCACTT/^WaGCTGGCG GGGCTGCAAA TATTGTCAAG^^AGCAAAAG
TTTTACAAAC AGGACTTATC CATTTTAGTG AAAGATATAC ATTAAGAAAA GATCTTGAAA 
ACTTACTAAA GGAGGCAAAA TTGGAACATC CAGACGGAGA AATTTTTTTA ACAAGAGATG 
GAATGAGGCT TGAAGCAAAC AAAAATAACT TTATTATTAA ATAGGAGGGT ATATGATAAA 112680 
TGTAGAAAAA GTTACTAAAA TGTATGGGCC ATTTACAGCA CTATTTAATG TTAGCTTTAA 112740 
GGTTGAAGAA GGCGAAGTAC TTGGTATACT TGGCCCAAAC GGAGCCGGAA AGTCCACATT 
AATCAAAATC TTAACATCAT TTCATTATCC AAGCAAAGGT AATGTAAAAA TTTTTGGAAA 
AGACATTGTA GAGCATTCGA AAGAAATACT ACAGCAAATA GGATATGTTC CTGAAAAACT 
AGCTCTTTAT CCAGAGCTTT CTGTTAAAGA ATATTTAAAG TTTATATCAG AAATAAAAGG 
TGTTAAAAAA TTAAAAAAAG AAATTGACAG AGTAATAAGC ATATTCAAAT TAAAAGAGGT 
TGAAGATAAG CTGATTTCTC AACTTTCAAA AGGATTTAGA CAAAGAGTAG GAATAGCTGG 
CGCTTTAATA AACAATCCTA AACTTGTAAT ACTTGATGAG CCAACAAACG GTCTTGATCC 
AAATCAAATA ATTGAATTTA AAGAATTTTT AAGAGAACTT GCAAAAGAAA GTACAATATT 
ATTCTCTTCG CACATACTAA GCGAAGTAGA ATCTATTTGT AAAAGAATAA TTATTGTCAA 113280 
CAACGGAGTA ATTGTTGCTG ATGACACAAA AGAAAATATT ATTAAAAATA AACTTAAAGA 113340 
GATTGAAATA GAATTAATAG TTTCAAAAAA ATCTGAAAAT GAGAAAAAAA TTTTCAACAG
CAAAAATGAT ATTTTTTCAT TAATAAAGCT TGAAGAACAC GAAAAAGACT TAAATATTTC
ATTAAAACTA TCTCAAGGCA AAACAGAAGA AGATCTCTTT AGCTACATAG TAAAAAATAA 113520
TATAATCTTA AAAGCAATGA TTCCAAAACA TGAAAGCCTT GAAAAGATAT TTAGCAAATT 113580
AACCAAGGAG AGAGAAAAAT GAAAATAGAT TTAAAGCAAT CTTTATCGCT TTCTAAAAAA
GAACTAAAAA TATTATTTGG AACCCCAACT GCATACGTTG TGATGCTATT TTTTTTAATA 
TTCATAAACT TTTCATTTAT TTTTTTATCA GGATTTTTTA TTAAAGACAA TGCATCTCTT 
ACCTCTTATT TCTCTTCAAT GCCTATTATT TTAATGTTGG TACTGCCAGC ACTTAGCATG 
GGAGTATTCT CAGAAGAACA CAAAACAGGA AGCATTGAAC TTCTTTATGC TCTACCGCTA 
AGTCCTCAAG AGATAGTCTT GGGCAAATTT ATTACGCTTA AAATATTTAC CTTAATACTA 
TTCTCACTTA CCCTACCTCT TACAATAATG ACAATTTTCA TGGGCGAATT TGATCTTGGG 
ATAATATTGC TTCAATATCT AGGAATAATT CTTTATTCTC TTTCTGTGCT AAGCATGGGA 
ACATTTATAT CCTCCATTAC AAAAAGCCAA ATAGTCTCTT ACATTCTTAC CGTATTTACA 
CTGATATTAA TACTATTTTC TGGGAAATTG GTTATGATCT TTGGAAAAGA AAATATAATA
GGAGAAATAC TTAATTTTGT TTCAATAACC AATCACTTTA GCTATTTTAA TATGGGTATA 
TTAAACTTAT CAGACTTTAT TTATTTTATT ACATTCACAG TCACATTTCT AATACTAAGC 114300

ACACACAGCA TAACACTAAA AAAATGGAGA TAAATTTATG AAAAACAAAG AAAATGAAGT 114360 

TTTAAACCTA ACTTTGAACC TTACAATAAT CTTTTTGATT TTTTGTAATA TATCTATTTy 114420

CATTTTTAAA ATAGACTTTA CAAAACACAA AGCTTTTACA ATATCTAAAG TTACAAAAAA 114480 

TTTGTTCTCA AGTGCAAATG AAACAATATA TATAACATAT TACAATTCAG GAAGCCTTGA 114540 

AAACTATTTT GCTTTTCCAA ACCAAATAAA AAATTTTTTA ATAAGTTTTT CTGATGCTTC 114600 

AAAATGTAAG GTAATTTATA AAGAAATTGA CGCTGATAAA ATTTCAACAC CATTAGAGCA 114660 

CATTGGTATT CCCTCTCAGC AAATCGACTT AAGAGATATT AATCAGCTTT CAATACTCAA 114720 

AATATACTCA GGAATTGAGA TTATTTACGA GGGAAAAAGA GAAGTAATAC CGGTTGTAAC 114780 

AGAAATCAGC AATCTAGAAT ATGACCTTGC AAATGGACTT GACAAACTAA TAAATAATAC 114840 

CAAAAAAGTT TTAGGACTTG CTTTTGGAGA CAGCACTTTA AAAGAAGCAC ATAAAAACTT 114900 

TTCCGAAATA ATGAAAAAAG CATTTGGAAT TGAAATAAAA GAAATAGATT TAAAAACTGA 114960 

AAAATTAGAA GACATTAGAA AAGATATAAA TGGATTATTC ATTATTGGCG CTAAAGAAAT 115020

TGACGAAGAA ATTGCAAAAA AAATTGACGA TTTTATTGTT AATGATGGAA AAATATTTGT 115080 

TGCAACAAGC ACAATTGACT ACAATCCTCA AAATCCATAT GGCATAACTC CTATTAAATC 115140 

CAGCCTATTT GATCTATTTG AAAGTTATGG GATAAAATAC AACGATAATA TTATTCTTGA 115200 

TAAAAGAGCG CCCACAATCT TTTTGGGTGG CAATTTCCAA ACTTACTATC CATGGATCTT 115260 

AATAGACAAA AGCAATATTG TAAAAAAAGA CATGCCATTG CTTAAAAATT TTTATACCGC 115320

TACAATTCCT TGGAGCAGCT CATTAGAACT TATAAAAAAA GATGAAACAG AAGTAAAATT 115380

TTTACCTCTA TTTGCAAGTT CCAAACAATC ATGGCAAGTT AAAGAACCTA ACCTTTCAAA 115440

CATATCTTTG AATGCATTTG AAGTTCCAAA TAAATTTGAA GAGAATAAAA CTAAAATACT 115500 

AGGATATGCA ATTGAAGGAA AAATTAAAAG TCCTTATAAA GATCAATATT CCAAAAATTC 115560 

TAAAATAATC CTAACAGGAT CAAGCATGAT ATTTAGCGAT TATATGTACA ACGGGTCTCC 115620

ATCAAACTTT GAACTATCAG GAAGAATTTC GGATTATTTA ATGCAAAAGG AAGAATTTTT 115680 

TAATATTAAG TCCAGAGAGG TACGAGCTAA ATTAAAATTT GCAAGCTCTT CAAACGAAAT 115740 

GGTCAATGCA AAGTTTTCAT TAATAATTGT TAACTTAATT ATTCTTCCAA CAATAATATT 115800 

AATATTTGGA CTTGTTAGAT TT^CTAGAAA AAGAAAAGCA AATTAATAAG AACAAAGGAG 115860 

TGTTTATGAC AAAACCAAAA ATATTCTCAA TCAATAAAGA AAAAATAAAA ATATTGATAA 115920 

TAGTAGTGTT AACATCTACA TTCTTATTGG GAATAATTTT TTCAAATGAA AATAAAGTAG 115980 
CAAGGATCCT TGAAGAAAAA^WrTTTGATT TTGACTTTAA TTTAATTTCT^BCaATTGAAA 116040
CAGAGCTTGA AGGAACGCTA ACAAAACTTG GCAAAGATTG GATTTTAACA TACAATAAAC 116100 
AAAATATTCC TGTTGATAAC AAAAAAGTCA ACTCTCTAAT CAAAGCATTA GACGAGCTTC 
AAAAAAACAA GCTTGTAAGT AGAGATCAAA AAAAACACAA GGAACTAGGA ATTGGAGAAA 
ATCCAAGCTT TAAATTATTT GACAATAATA ATAAGCTGTT AACAGAAATT TTTGTTGGAA 
AATCAGGAGA AGGCGATTCA AGACTGGCAT ACATTAAAGG TAGTGACGAA AATGTTTAGT 116340 
TAACAAAAAA CATTTTCTTA TCATACAAAG GAAATTCTTA CAATACATTT TCAGATACTA 116400 
CATTGTTCCA AGAAAAAAAC ACAAAATTAG AAAATTTATC ATTCAAAATA ATAAGAAAAT 
TAAACAAGGA AAATGAAAAT AACATAAATA ATAACTATGA GATTATCAGT AAAGATGGCC 
TTTATTTTTT AAATAACCAA AAAATGACAA AAGAAAGGCC TTTAAATATT ATTGCTGAAT 
TTAAAGCTGA CGGACTTGAA ATTGATAAAT CTAAAATAGA TGATTATAAT CTTCAATACA 116640 
AAATTGAAGT CAAATGGAGC AATAAAAGTG TCAATAATAT TGAAGTTTAT TTTAATAAAA 116700 
ACGAAGAAAA TGACAAAGAC ATATTAATCA AAAAAGATAA AGATGAATAT TACTACACGA 
CTAGCAAATG GACTTTTTTT GATGTATTCG ACTTAGAAAA AAAATTAACA GAAAAAGATG 
ATATTTCTAG CAACGATAAT CAAGAAGATC ATCATGAACA TCACAACAAT GCAGATTAAT 
CTTGCTATAT ATAAAAAGCA TTAAAAGAAA AACATATAAA AATAAATATA ATTAAAATAT 
ACCATGACAA AGACAACATT TTATCAAAAG ATAAGATGTT GTTTGGCTTT ACTGATATAT 
CTTTTAAATT AAAATTAATA TCAAACAAAC CGCTCAGTTA TCAAATATTA ATTTTAAGAA 117060
TTTTTATAAA AAATAGAACT TAAACGAATG GATTTTCAAC CTTTAGTAAG TAAAAATTTA 117120 
ACTTTTTTTA AAACTTCATA CTCTTGTTTA ATTTTAAAAA TATTTCTATT AGGATTTAAC 
TCAAGTTCAC TTTCTACCCT ATCAATAAAA TTTATTAATA TGGCTTTTTC ATCGCTATTT 
AGCACAACAT TACTTAAAGA TTCTTCAGCA TTAAATTTGC TTACAACCTT TTCAACATCA 
TTAAAATTAA AATTAGAAGG AAAAGGGTCC TTAGTTATGC TTTGAGAGCT GGGCTTTTCA 117360
TAATTAGCAA CATCATTACT TTGGTCTTGA TCTGATTTAT CTCTCAAATT ATCATGCAAA 117420 
TTTTTCTCTC CAGCATTAGA AAAAGAGCCT TCGGCAACCG TAGAATCCAG ATCCTCTTTT 117480 
AAAATATTTT CAAAATCATC AAATTCTTTT GACTCAAAAC GTTCACCAAA GCTTTTAAGA 117540 
ATCGCACAAT TATCACCATT CTCATGCTCA ATGTTTTCAT TCTCATTAAT TAAATCCAAA 117600
CGTTGTTTAT TA^PTATCTAC AAAGCTCTCC AATTTTCGAG AGTTATCGCA ATTAGCAATA 117660 
GAATAGGGAT CTTCTGCTCC CACTTCATGT TCCAAAGAAG AACTATTATC CAAATTTTTA 117720 
TTAACAGAAT CTAACAGTTT ATCTGGCTCT GTAGTAAAAC TTTCCAATTT TTTACTGTGA 117780
TCGTTATTAT TATTTAACTC CATATTTCCT ATGGACACAT TATCACTGTC AACATTAATA
TCCTCTTTTT TAAGAATATT ATTAGAAAAA TTATCAGCCA CATGTAAATT GTTAGTCAAA 
TTCAAATCTT TATCTCTATT AAATTTAACA TCGTGCTGAA CTTCTTCATT TTCTTTTGGG 
GTATAATTGA TACCTTCAAG CAGCGCATCT AATTCTTCTT GACCAATAGA AATATTAGGC 
AAATTATCTT CTTTTTTTGA AAATGAATCA TCCGTCTCGG ATGATTTTTT TTCAAAATCC 
TTCGAATCAT AAGAAATAAA ATCCCTTTGA ACAAACTTAA GACTATTATC AAGCATTTCT 
TTTTCAACTC TCTCAACACG AAGCTCAACA CCCTTTAAAC AATTAATCAG CTCGCCGTGC 
CTTTCTTTTA AGATGTTATC GAGTTTTAAC AGCTTTTCAT CCAAAAAAGT TTTAAGATTA 
TCTGGGGCTA TAGAAACAAT ATCAACAGAC TCACTTCTTA AAAATACCTT TTCAATATCA 
AAATTAACTT CCTTGGGACC TTCTTTGATA AAAAAAACAC TTTCCTTCAT GCCAAACTTG 
CTCTCCAAAA CTACTCAATA AAACTACAAT CTAAAGCAAT TTTAACTACT TGTAAATATA 
GTATATTAAG AATATAATTA CAAGCTATAT GACTATTTAT AAAAAAATTG CAATGTCTTT 
TTACTCAGGA ATACTAAGCT ACTTTATAAT AGCTCCCATA TTTGGAGAGA GAGGATTTGT 
TAATTATCAA AAATTGGATA ACAACTTAAC ATTAATAAAA AATCACATCG AAAAACTAAA 
AGAAATTCAA AAAGAATTAA AAGCAAGATA TATTAACCTA CAAGTATCTA AATCGGAAAT 
TCTAAAAGAA GCTAAAAAAT TGGGCTACTA CCCAAAAAAC TCAACAGTAA TAAAAACCAA 
CAATAATAAA GATCAATATA ACCAAGGGCA AATATTAACC TTACAAAAAC CCCTTTCCAA 
GAATCAAAAT TTTTACCTTA TATCAATAGC AATAGGTTTA ATTTATTATT TTTTATCAAG 
CTGCATTATC CAAACCAAGA AAATTACAAA AATCAATAAA CTTGCTTCCA ACAACTCTAA 
GGATTAGTCT TTATTGAAAA TATTTATTTT TAAAAATACA ATATATTTAT TAATTAATTT 
AATTTGTGCA TCATTTTTTT GCGTATCGTT AGTAAATCTT TTTTCAAATG AACAACAGTA 
TACTCCTTTT GTTAAAACAA ATGTCATAAA AAATTACTTA CAATACATTG GAGTATATAA 
AAGTATAGAA AGATATGCCC TGATACATGA CTTTAACCCT AAATCAAAAT TAGAAAAAGA 
TTGCTTTTTG AAGCATATAG CTGGCAATTC ATATATAATA TACAAAACAA AAAATGAAGG 
AATGCTGTGG GGCGATCATC GATACTCTCT GCTGAGCAAA GGAAAGCCAA CTACTAAAAT
AATTTTTCAA AAAATATTTA ATACTTTAAA AATCTCAATT CCAGGCGCCC TACTCTCTTA 
TATTGCGGCA ATAATCCTTA TTATAATTTG GAAAATTTAC ATAAAAAATA ATCTAATAAA 
TAATATTCTA GAATATTTAA TGCTATTGCT CCACTCCATG CCAAGAAACT TAACAGTATT 
TTTAATACTG TCTTTAATAT ATTACCTTAA TTTAAATCCA AAAAATTTAA TAATGGGTGG 
119580



CAAAACTTTA TCAGAATTTT ACATAAAAGC TGCAAAATCA AGAGGAATAA ATAAATTGCA 119640 

AATAATCTTA AAACATGCAT TAATTCCATC AATAACACCA TTACTCACAA ACATGAGACC 119700 

TATTATTACA ACAGCTTTTT TTGGAGCATC AATGATTGAA TCAATGTTTG AAATTGATGG 119760 

AATTGGGGCC TTATATTTAA ATGCTTTGAA ATTTAACGAT TATGCTATTT CTAAAGATTT 119820 

GATTTTTATT GGCGTTTTCA TTATGCTTAT TCCAAATATA ATAAGAGATA TACTAATTTA 119880 

CAAAATTAAC CCATATAAGG ACACTCTAAA CTAATGAAAA CAGATACAAT AATAAAAAAA 119940 

ATTTATATCG TACTCTTTAA TATATTTATT GTGTTGCTAA TTATTACTCC GTCATTGGTT 120000

AATGAAAATT CAAAAATTGC AATCTATAAA AAAGATCCAA ATAAAGTCTA TTTAAAATCT 120060

ATTAAAAATG TACCTATGCC ACCCACAAAA GACAACCCAT TAGGAATCGA CAAAATGGGA 120120

AGAGATATTA TGGGAAGATT AATAATTGCA ACCAGAAACT CTATTTTACT TTCACTAAGC 120180 

TACGCAACAA TTTCTGCAAT AATTGGAATC TTTATTGGAA CAATCATTGG CATGTTTAGT 120240 

TTTGAAATTT GCATGCTGAT TTCAAAACCA ATTGAAACAT TGCAAACATT ACCTTTTTTT 120300 

TACGTTGTGT CTTTAGTTTT TTATTACTTT TTAAAACAAA AAACTTACAA TATGCTTCAA 120360

ACAGCAACAC TATTAGCATT GATTCATGGA TGGATTAGAT TTGCTTTTAT TGCAAGAAAC 120420 

AATACATTAA TAATAAAAAA TTTAGATTAT ATTAAAGCCA GCGAAGCTAT GGGAGCAAGC 120480 

AAAATTAGAA TAATATTGTA TCATATTTTT CCAGAAGTAT TCTCATCAAT ATCATCTATA 120540 

ATCCCATTAC AAATGGGAAG AAGTCTTACT ACTTTTGAAG TAGTAAGTTT TTTACAAAAA 120600 

CAAGATAAAA ATCTATATCC CAGTCTTGGA GAACTGCTCA ACTATATGCA AATGGGCAAT 120660

AAATATCTAT GGATATGGAT CAATCCCTTA CTCATATTAA TAGGCATAAA CATAATACTA 120720

GCAATTATAA ATTTTAAGCT AAGAAAAAAA ATGAAACATT TAATATCATC TTAAATAAAA 120780 

AATTAACAAA CTCTTGGAGC AAATTTTTCT AAAAAACAAT TATCACAATT TACATTTCTA 120840 

GAAGTACAAA TTTCTCTTGC ATGCTTATTA ATAGCCATAG AAAATCTATA CTGCTTACAA 120900

GGCTTTATTC TTCTTTTTAG ATCCAATTCA ATCTTAATAG GAGAACTTTC CAAAGAAAGA 120960

GCATGTCTTG TAATAACTCT ACTAAAATGA GTATCTACAA TAATTGCGGG TTTATTGTAA 121020

ACAGATCCAA GAATAACATT TGCCGTTTTT CGACCTACTC CAGGTAGCTT AATAAGATCA 121080 

AAAATATTAT TTGGAATAAC ACCATTAAAT TTTTCTAAAA TATCAATAGA GCAATTCACA 121140 

ATATTTTTAG CCTTTCTTGA ATAAAAACCA GTCTTATAAA TTAATTTTTC AACATCTCTC 121200 

ACATTTGCTC TTGATAAACT TTCAAAATTC TCGTACCTTT CAAAAAGGTA TGGAGAAATT 121260 

TTATTCACCA AATTATCTGT TGTTCTTGCA CTTAAAATAA CCATTATTAA AAGTTCATAA 121320 
TTGTTTTTAT AATTTAAAAA AGGTTTAACA TCAGGATATC TAAATAAAGT TTCATCAACA
ATCAAATCAA GATTAATCAT AAAAAAATTA TAAAACATTA TAAACACAAA ACAAAAATAA 
AAAATATACA AAGTAAAGGT ATCTAGACTT TATTGACAAG GATTTTTCAA AATGATATAC 
TCATCATTAG AATTTTAAAT GCACCAATAG CTCAATTGGA TAGAGCAACA GACTTCTAAT 
CTGTAGGTTT TAGGTTCGAG TCCTAATTGG TGCGCTTCAT TCGGGATGTG GCCTAGTGGC 
TAAGGCACCT GCTTTGGGAG CAGGGGATCG TGAGTTCGAA TCCCACCATC CCGAAAAAAT 
ATTAAAAAAG CTAAAAACTT TTGTTTTTAG CTTTTTTGGT TTTTTAACGA TTTATACAAA
TTAAATCTAA CTGTAAAGTT ACTTAACTTT CTTTAAAGTA TTTACATCTA AAATAACTAG 
ACTTTTAAAC TCATCTTGCA AATAAATAAA ATTTTTTCTC ACAGAAAAGC TAGTAAAAGG 
CATAATTTTA TTCTCTGAAA GAATAAACTC ATCTAAATTT TTAGGAGAAA ATTTGGCCAA 
TCTCCAATCA TTACTACTAT CTTTATCCCT AACAGCTACT AAAATCATTT TAGAATCAAC 
ATAAAGAGAT GAATTTTTAT TAATCTCAAA ATTAGACTCT GATACCACTT TTAAATTTTC 
AAGTTTATCA AGTATCTGAA GCTTAGCTTT TCCTGAATCC ATTTTAATAA CAACCAAATC 
TTTTTCACGT TCATAAATTC CATACCGCTG AATGCCTTGC TGAGTGCTTT CTTTAAGCCT 
AACACCAGTA TTTAAATCAA TAAGTTGAAG AGTTCCTAAA TTTGTAATTG GATCAATAAC 
CTCTAAAAAT ACAGGACTAC TGGAATCTAT AGACATAGTA GTCAAATCTT CATTCAAAGA
AGTAACTTGG TCTTTAACCT GAGGCTTAGT CTTTTGCAAA TTAACATCTT TATTAACTGT 
CTCCTCTTTT GAATCAATGT CTTTATAAGA AGATTTATCT AACGGTGATA ATTCTCCAAC 
ATTGTTATTA GACTTGAAAA TCTTATCTAA TTTCTCAACC TCAGAAACAG GTTTAAATTC 
TTTTTTGCTA TCTAATTTTT TAACCTCAGG TAATTTTTGA TCTTCTGGCA TCATAAGATT 
TTCATCATTA TTCAAATCGC CTAAGCTTTT CTGTGACTTA CCCTTGGTTA TTTCTTCTTC 
CTTGGCTTTA CTTTTTTCTT TGCTAGAAGC TTTAGAATTT AATTCTCGAT CAAGATCCAA 
GGCTTTACCA TCTTTACTTG CTTTATCATC TTTACTTTTT AAAAGCTTTT CATCACTTTT 
TTTGATTTCA ATTTGCTTTT CAATTTCTCT TTTCTGATTT TCATCACCAG TTTCTTTAAG 
CTGCTCCTGC AAATCTTCCA GGCTCTCTTT TATTTGTAGT TGCTTATCAA CTTTAGGAGA 
ACTTACATCA CCAGGCTTTG GTAAATTCTT TTCCTTGTTA ATTTCGTTAA TATCCTCTTG 
AATTTTCTCT CTAACAGTAT TTCTTTGAAC ATCTAAATTA TCTTCAGCAG AATCTAATTT 
TTGCTGAGCT TTATCAAGAT TTATTGCCTT TTTATCTAGC TCTTCCTTTT GTTTCTTTTT
AGCATCAACC TGACTTTCAA TCTCTTTTTT ATGCTCTTCA TCTGTAGCTT TTTCAAGCTG 
ATCCCTTAAA TTTTCAATAG ^VFCTGTTAT ATTGGAATCA CTTTCATGAA ^PPFtGTCTAA 123120

TTCAATATCA ATTTTATCTT GATCTGCCTT ATGAGTTTCG CCTTGAATAT CTGTAATATC 123180 

TCTTGCAAAG TTAACACCTG CTTCATTTTC ACTTAAAAGA GCTGCCACCA CCTTATCTGT 123240

AACTAAACTG TCAATATCAA TGTCAGACTC AATATTTCCA GACAAAATAT CCTTTTTAAG 123300 

AGGAATAAAT ATTTGTGTCT TTCCAGCCCA CTGACTATAA ACCCTAGAAA GACCTGCATT 123360

TTCTTTACTT AAAGACTTTA AAGCAGCCTC AATATAAAAC CCTTTATAAT AATCCAAATC 123420

TCCTCTATAA ACAGCATTAT ATATTGTAAT AACCTTAGCA ATTAATTCTG CACTAGACCT 123480

GTCATAATCG AAAGACTTTA TTAAATACCC TGTAAGAATT CTTCTTAAAT TCAATATACT 123540

GTCAAGCTCT GACTTACTAC CAATAGAAAA AACATCAACG CTTGCTTTTT TATCTTGATC 123600 

ATCAATAAAT CTATTAATAA AATATTTACC ATAATAACTT GAGTTGCTAT TGGAATTGGT 123660 

CAACGGTCTT GCTAAAAACT CCCCAATACC CACTATTTGT TCATATGTAT TTGTAGAATC 123720 

ATAAGGGCCT TTATAATTTA CAAACTCAAG ATCCATATTA ACAAAGTCCT TTAATTTTTC 123780

CCTATCAACT TCTCTTGCAC TAACAGGAAA TCCATTCAAG AAAATAAGAA AAAAACTAAA 123840 

GATTAGTAAC ATTTTTTTCA TAAAAGAAAT TCTCCTATAA ATTTAATTAT AATCTACCTT 123900

ACCAACTAAA TTCACCAATT TAAACCTAAT TTTAACACAA TCGGATTTAT ATGCAAAACA 123960

GCTAATTCAA TCTTGGGGAC TTGAAATATC TTGAATGCTT GAAATAATTT CATTAATGTG 124020 

CTTGTTAATA TTTTCAGAAA AATTTTGACG CCTCATAAAC ATTAATAACT TTATATACTT 124080 

ATGCTTACAA AGTTTAAGAA GTTCGGCTGT TGAATTTAAA ATTATATTTC TCAAATTTAA 124140

AATCACAGAA TAAGCTATTT CCAAATTACC CTGATAATTT AACACTATAT TTTGGATGTA 124200

ATTGTATATA ATACAATCTT GCCCATAATA ACTTTCAAAT TTAGTAATAG TTTTGACTAT 124260

GGTAATTATA TTATTAATAT CTCTCTCTCT AATTGCCAAA TTAATAAAAC TCTTAATTGC 124320 

AGCTTTATCA TTAACATATA CAAAATATCT ATAAAGCTTG CCTCTTGAAG CAATTCTATC 124380

AAACCTCTTT TCTGACAAAA GCCACACATT AAAAGCTTTT TCATATAAAT TTAAATATTT 124440

TGCGGGAAAA TTTCCATGTT CAGCTGCAAG AAAATCTAAA ATCTCTAAAA ATTTATTATA 124500 

AGCAGCATCA CTATAATTAG AATTAAGGAT CTCTTCAAGA AGTAAATAAT TTTTATCATA 124560 

ATCTACAAAA CGCTTTTCTT TTAAATCTAT TCTGCCTTGA TTATATTTAT TACCTCCAAA 124620 

ATGGTTCCAA ACGCTAACAT CAAGCTCTCT GTCGTATTCG TAATAATAAG TACTAAGAAC 124680 

CCAATCATCT CCCCCTAAGA CAATAATACC TGAATTAGCA CTAAGTCCTG ACAAGCCCTT 124740 

AAGTTTAGCA TCTGAAAAAG ATATATCAAA ATCCTCTTCA AAAGACAAAA TTCCATTTGA 124800 

TTTGGTTGAA ATTAAAAGAT TTCTATTATT ATAAAGAACA GCTTTCTCAA TCTCAAAGTT 124860 
CATTTTTTTT TTAAATTTAA GCTTAAAGTT TCTATTCAAA TCAAAAAACA TACCAACATT
TGAAATTGCG ACATATTCAC CATTAAACTT TTCAAACAAA AAATTAATAG GAAAATCCAA 
CTTAATAGAG CTAATAACTT CCCCAAAATC TCTTCCATAA GCAACAATAT GCCCTGACTT 
ATGTCCAACA ACTACCTCTT TTTTTGTATT TATCATTAAC AAAAAAGGGA AAGCAACTAA 
TCTGAAAAAC CATTTCTTAT TACCCAAAGA ATCAAACAAA AATATTTCAT CATTTTCACT 
TGCAATACAA AAATCCCCAT TATCAAAAAC TACAGGAGAA GTAGCAGGCC TACCACCTAT 
GTCAACCTCA AACATTTTTT TACCGCTATT TAAATCAATA GAAACAACCT TTTCATTAGC
AAGAGGAATT AGGATGTTAA CATTTCCTAT TGCAGGAGAG CTCAAAGGTG AAAAATCAAG 
CTTATACTTC CAAACGAGTT TTCCTCTTCT GATCTTTTGA ACTTCATTTC TAACTGTAAT 
AACATAATAC CCATTATCAA AATCTTTCAA AAGAAAAGGA TATGGCATTC TATTTAATCT 
ATAAGAATAT TTCTTCTCAA ATGACATTGT ATAAGTAGTC AACCATCTAT CTTTTGTTAA 
AACTGTAATA GTGTCACGTT TTTCATCAAT AATTGGATTG CCTGCAACTT TGCCAGTTAA 
TGCTTTTTGA AAATATAAAT TAATATCAGA ATAAAGCCTT AAAAAAGAAG CTGAAAACAC 
AAATATGAAA AGTAGACCTC TCAAAATAAA AAACCTTTTG AGTTTCTAAA AAACTGCTAC
TAAAGCCTAA AACCAGCATT ATTGCCATAA AGATTATTTC TCAGATCCCT AATCTTAGCA
TCATCAACAT ACTCAGAAAA AGTCATATAT CGATCAATTA TTCCGTTAGG AGTAAACTCT 
ATAATCCTAT TAGCAACAGT ATCTATAAAT TGATGATCAT GTGATGTAAA AAGAACAACT 
CCTTTAAACT CTTTAAGCCC GGAATTTAAA GATGTAATTG CCTCAAGATC TAAGTGATTT 
GTGGGTTGGT CCAGTATTAA AACATTAGCT CCGCTAAGCA TAGCCTTAGC AAGCATGCAT 
CTTACTTTTT CTCCCCCTGA GAGAACATTT ACCTTTTTTA AAGCTTCATC TTGGCTGAAA 
AGCATTCGAC CTAAAAATCC TCTAATATAA GTTTCATCTT GTTCTTTTGA ATACTGACGT 
AACCAATCGA CTAAATTTAA ATCTAAATCA AAATATTTTC CATTATCTTT ATTAAAATAC 
GAAAAATTAA CGGTAGATCC CCATTCATAA TGACCTTTAT AATTTCTATC TTCATTTGTA 
ATAATATCAA ACAAAAAAGT TGCAAACATG GGATTTCCCA AAAAAACAAT CTTTTGCTGA 
GGTTCAACAA TAATACTAAA TTTATTTAAA ATTAAATTCC CTTCAAATTC TTTTATTAAA 
TTTTTAATTG TAAGAACATT CTTGCCAAGT TCTCTTTCGC TTTTGAAATT AACATAAGGG 
AACTTCCTTG AAGAAGGCTT TAAATCTTCA ACCTTTATTT TTTCAATCAA CTTTTTCCTT 
GATGTTGCTT GCTTAGACTT AGATGCATTA CTAGAAAATC TTTGAATAAA TGTCTTAAGT 
TCAGCAATTT TATCTTCAGA TCGCTTTTTA GCATCTTTTA GTTGCTTGTT TAAAATCTGA 
CTTGTTTCAT ACCAAAAATC^WAATTTCCA AGATACACTT GAATCTTGCC^IPkATCAATG 126660

TCAACAATAT GAGTACAAAC TTGATTTAAA AAATGTCTAT CGTGAGATAC AACAATAACT 126720 

GTATTTTCAA AATTAATTAA AAACTCTTCT AACCATTTAA TAGATTGTAG ATCAAGGTTA 12 67 80 

TTAGTAGGCT CATCAAGAAG TAATACATCG GGATCACCAA AAAGTGCTTG AGCCAAAAGA 12 6840 

ACCCTAACTT TTAAAGCCCC TTCAACATCA CCCATTAAAT TATTATGAAT TGCCTCATCT 126900 

ATTCCAAGAC CTTTAAGAAG AACCGCTGCA TCAGATTCAG CCTCGTATCC TCCAAGCTCT 12 6960 

GAAAATTCTG CTTCAAGCTC TCCAGCTCTA ATTCCATCCT CATCAGTAAA ATCAAGCTTA 127020 

CTATAAATTT CATCTTTTTC TTTTTGAACA GAATAAAGTC TTTTGTGACC CATAATAACA 127080 

GTATCGATAA CCTTATATCC ATCATAAGCA AATTGATCTT GTTCAAGAGC TGCTACTCTT 127140 

TGATTTTTGG GGATAGATAT TTCACCCTTA CTAGCTTCAA TCATTCCCCC TAATACTTTT 127200 

AAAAAAGTGC TTTTTCCTGC CCCATTAGCA CCAATTATTC CATAGCAATT TCCAGGAGAA 127260

AATTTAATAT TTACATCTTT GAATAAAACT CTCTCTCCAA ATGCAACTTC CAAATTACTT 127320 

ACAGTTATCA AACCATTACC CTGCCTAAAT TGATATTCTA ATAACAAAAT TATCTTGAAA 127380 

ATTAATTTAA TTTTCAAGCA CCATATAAAT ATATTGACTC AACTCTCAGT TTTTTCGTAT 127440 

ATTTAATATT ATTATATAAG GAGATGTTTG AGATGAAAAA TATTAAGCCG TTAGCTGATA 127500 

GAGTTTTAAT AAAAATCAAA GAAGCTGAGA GTAAAACAAT CTCAGGACTT TACATACCAG 127560 

AAAATGCAAA AGAAAAAACA AATATTGGGA CAGTTATAGC TGTTGGTTCT AACAAAGAAG 127620

AGATCACTGT AAAAGTTGGT GATACTGTGC TTTATGAAAA ATACGCAGGA GCTGCTGTAA 127680

AAATCGAGAA TAAAGAACAT TTAATACTAA AAGCAAAAGA AATAGTTGCA ATAATAGAAG 127740

AGTAAAAAGC TAAGTTTAGC TACTTAGCTT TAATTTTTAT TAAATATTTA ATAAAAATTA 127800

CAAATTTATA CATAAAAACT TATTATTCTG ATCAATCAAA TTAAAAATTT CAAGCTTACA 127860

AAATTCTGTA AGCTTGAAAA AATAAAATTA AATGAAAAAG CCAATTTTTA AAGAAAATAC 127920

CATATATTCA AGCAAATTCG ATGACATCTA TTACAATCCA AAGCAGGGAA TTGAAGAGAG 127980 

TTTTTATACA TTTATTAAAG GTTGCAATTT AGATTTAGAA TTAAAAACAA AAAAAAATAT 128040 

TTTAATAGCA GAGTTGGGAT TTGGAACAGG ATTAAACTTT ATATGTCTTT TAAAATTCAT 128100

AAAAGAAAAC AACATAACCT CAAAAATTAA TTATTATTCT ATAGAAAAAT TTCCACTCGA 128160

AAAAAAAACA ATAATGCAAA TTTCAAAGTT CTTTGCTAAA GAAACCGCTT ATTTTAAATT 128220

AATGTTGAAA AATTATTCTA AAATTCCAAA AAAAAATTTA AAACTAAAAA TAACAGAA^A 128280

TGTTAATTTA AAAATTTTAA TTGGAGACGC CAAAATAAAA ATCAAAGAAA TTCCTGAAAA 128340

TGTAGAATAC TGGTTTTTAG ACGGATTTAA TCCCAAAAAA AATCCTGAAA TGTGGAGCAA 128400
TGAAATATTT AATTTAATTT CTGAGAAAAG CAGTCCGAAA TGCAAGCTTT CAACATTTTC
CTCTGCAAGA ATTGTAAAAG ATGGCCTAAA ACTTGCTAAT TTTAAATACA TTCACATAGA
AAAAGGATTT GGAAATAAAA GACATATGAT AAAAGCTCAA AAAAATTAAA AATTTATTTT 
TAACATAAGT CGTTAAAAAA ATCCCAACAA GTATGATATA CTTCCAAATG GCACAAGGAG 
AATTTTAATG ACAAAAAAAT TGTTTGTGAG GGTATTAATC TTTTTAATAT CCAATAATTA 
TGCTTTTGCA AAAGACACAA TCAAAGATTT GTTCTTTATA CAAGATATAC TAATAAAAAA 
AGAGAAATAT TCCGAGGTTC TAAATAATGC AAGCCTTGAA GGCATTATTG AAATTGAACA 
TAACGGACCA TACATTAAAG ATCACGATTC AGAAGTTAAA CTTATCCTAA AAGAAAACGG 
ATATAGAAGA AATTTCAACT TTTTTAATCT TTTAAATACT AGTAATATAA TCAAAAGTCT 
AAGCTTATTT GACAGCAGAC CAAAAAACAT TAAAGAAAAT GAAATCATAT TATTAGAGAC 
AAAAATGATT AAAGAAAATC CCTATAAACG ATACAAAGAC GATGATGATT TTGAATTAAA 
ACTAAGTGTA ACTCGAAAAA ATAATCAAAT TTATTTAATT CTTGATTTCA ATTTCCTATT 
TGATCAAAGA AAAACGTTTC CATCAATTTA CATCAAAGAA GAAGATGTAT CAACAATAAT 
AAACAGCTTC ATGAAACTAC AAGATTCAAG CTTTTTATCT CCTCAAGCTT CTTAACAATT 
AATAGCACAA AATGTGCTAT TTCTAATAAA AAGCAAGCAT TTTACTGAAA AGCTAACCAT 
AGCCAATTTC ATTACATAAT AATTTTTCAA TCTTTTTACA GATTTTTTAA ATTAATAATA
TAATTATTTA TTTTATTAAT TAAAGAAGAA AATTCTACAA ATTTTAATTT TTCAGATTCA 
ACAATTTCCT TGGGAGCATT CATTAAAAAA TTTTCATTTT CAAGTTTCTT TGAAACAGAA 
ATATTGAGCA TTTTATACTT TTCAAGCTGC TTTTCAAGCC TTATCAACTC TTTGGTTTTA 
TCTATCAATG ACTTAACATC TGCATAAATT TCAAAACCAA CTGCAGCTAC ACCAAGCATG 
CCATCATAAT TTTCATTGTA AAATATATTT TTAAAATTAA TCATTCTTTT TACAATGCTT 
TCATTAGCCT TAAAGTATGC CTCATATTTA AAATCAGCAT CAAACTTCAA AGCAACATCA 
ATTTCAACAC TAGCAGGTAT ATTAAATTCA CTCTTAAGTG TTCTAATAGC TATAATAAAA
GTTTTCAATA CTTTAAAAAT TTCAAATTCT TCTTGAAAAT TATTGGCAAT ATCAAAATTT 
GGATATTCAT TTAAAGCTAA AATATCTTCC TTTTCTGCAA ATTCAGAATA AATTTTTTCT 
GTAACAAAAG GAATAAACGG ATGCAAAATT AACAATGATT TTTTAAGAAA AAATAGCAAC 
TTAGAAATAG CCATATTTTG AATATCAACA TTTTCATTAT TTAAATCAAT TTTGCTAATT 
TCAATATACC AATCACAAAA ATCATTCCAA AAAAACTCAT AAACAAATTT TGAAGCTTCG 
TTATATTTAT AATTTGCAAA AGAAGACTCT ACACCAAGAA TAGTCGAATT TAAGCTTGTA
GCTATTCCTG
TACCTTACAA
GTAACATTAG
AAAACATTGT
TGTCAATGT<^JFrAAATTTC AAATCATTTA
TAAATTTGGA AGCATTAAAA ACTTTGTTTG 
CGTCAATATT TAAATCTTGA CCCTGAACAG 
CACTTCCATA CTCATTAATA ATATCAAGAG 
TTTTTTTACC TTGTTTGTCA CGCAAAAGAG 
GCCCTGTAAA TTCTAATCCT GCCATCACCA 
AAGCTGTTAT CAAGGTATTT GTTGGATAAT 
ATCCAAGCGA AGAAAAGGGC CATAGCCAAG 
CTTGAACAAA CCTCTTCCCC ATATTCTTTT 
TAAGTTCAGA TGTATCAACA TTGTACCAAA 
TTGATATACA CCAATCTCTA ATATTTGATA 
TAGGATAAAA TTTTAATTCG CCATTCTCTA 
TCATTCTCAC AAACCACTGA GTAGACAAAT 
AATGCCCAAC CTGTTGTTTA TGCTTCTTAA 
CTGTTTCAAT TTTAAATCTT GCATCTTTCG 
TTTTATTAAG TTTTCCATCT TGAGTTAAAA 
AAATTTCAAA ATCATTAGGA TCGTGTGCAG 
CGCTGTCAAC ATAAAAATCT GCAATAACTT 
CTTCTTTGCC AACTAAAGAC TTATATCTCT 
CCCCAAACAT TGTCTCAGGC CTAGTTGTTG 
AATACTTAAC AAAATAAAGC TTACCATCAA 
CAACACTCCC AGATCCAGGA TCAAGATTAA 
TAAAATACAA GTCCTTAAAA ACCTTGTTAA 
ACCTTTCTCT TGAGTGATCA TAAGACGCCC 
CTCTATGCCT ATCTTTTAAT TTAAAAATTT 
CTTTGCTTTT ACCAATCTTT TTAAGATGTC 
CATGATCTGT GCCAAAAAGC CACAAAGTAT
GAACATCTTG GAAAACAAAA TTAAGAG£AT 
GAGGAGGCGC AACCATACTA AATTTTTCAA 
TTTTAAGCCA CTTAGTGTAA ATTTCATCTT
ATATTTTTCT^fFTTTTTAAA 130200

CAAATTTAGC CCCAAACATA 130260

ACAAAAAGGA TAAAGTAAAC 130320

GGTCTATTCC ATTGCCTAAA 130380

GTGTTATATA AACATCTTTG 130440 

TTCTEGCAAC CCAAAAAAAT 130500 

AATTTTTAAA ATCAACATCA 130560 

AAGAAAACCA AGTATCAAGA 130620 

CATCTAAAGA AGGATCAGTA 130680 

CCGGTATTCT ATGTCCCCAA 130740

ACCAATATTT ATATGTATTT 130800

AAGCCTTTAA AGCTTTGTCT 130860

AAGGTTCAAT AACCTCACCT 130920

CATCTTGCAA AAAACCCTTT 130980

CACTTAATCC TTGGTATTGC 131040 

TATTGACCTT AGAAATATTG 131100 

GAGTAACTTT TAAAGCCCCA 131160

TTATCTTTTT AGTTGTCAAA 131220

CATCATTAGG ATTAACAGCA 131280

CAACCTCAAT AAAAGAAGAG 131340 

CTTCTTTGTA TTCAATCTCT 131400 

CAAGATACTC ACCCCTATAA 131460 

CAGCCTTACA AAGATTCTCA 131520

CAAGTTTGTT TATCTGATTA 131580

CTTGAACAAG CTCTTCTCTT 131640 

TTTCAAAAAC AGCCTGCGTT 131700 

TGTGTCTTTT CATTCTTTTA 131760

GCCCCATGTG CAACACGCCA 131820

ATAAAGAATT ATCTGGCAAA 131880

CAAATGCCTT AGGATCATAT 131940
TTTTCAAGAG GTCTACAATT CATCGTTTAA AAATCTCCTA CCATGTCAAT AACTTTTAAA
TTTTCAAAAA GCTCTTCATA AGTTTTTGAA TTTATAAGAT TTGAGAAATA CTTACTCTGT 
CTTTCCTTCT CAGTAAAATA TATCTTAAAA TATTTTTTAA GTTTATTAAA ATCTTTTTCA 
AGTCCCCAAG TAGCATGAAA ATCCTTTATA TGGAATTTTA ATATATTTAA CCTAAAATTT
AAATTACTAT CTAAAAAGTG TTTTGAAAAG GGCTTAAATA AATTCAAATT ATCAAAAATT 
CCACGACCAA ACATAATTCC ATCAATTAAA TATTTATCAA CATAATATCT AGCTTCCTTG
AGACTCAAAA CATCCCCATT TCCAATAATC AGAGTAGAAG GACTAAGATT ATTTCGTAAT 
TTAACAAGTT CATAAAAAAT ATCAAAATTA ACAGGACCTT TGCTCTGATT AACAGCAAGT 
CTTGGATAAA CCGTTAACAT ATCAATTTCA AGGCCTAACA AAAACCCCAG CCAATCATCA 
ACTTCTGGAT ATGAAAATCC ATGCCTGGTC TTAACACTAA GAGGCAAATT AAATCTCGCA 
CAAGCTTCTT TGCTTGCAAG AATTATCTCT TTAGCTAAAG ATTTATTATT AATTAAAGCT
GAACAAACTC CTTTTTTAAT TATTTTACTC TTAGGACAAC CCATATTAAT ATCAATTCCC 
CAAAACCCCA TGCTCCCTAA TATTTCTATT GCTCTATAAA ATTGCTCAGG AACATTGCCC 
CAAATCTGAG CAATTAAAGG CCTATTAAGT TCATTGGGTT TTAAAAAAAC ATGTTGAACA 
GATTGTTTTG ATCCATTTAA AATTCCTTTT GTAGAAATAA ATTCGGTAAA ATAAATATGA 
GGTTCTCCTT CTGCAGATCC TATTAAATGA ATTAAATTTC TAAAAACAGT ATCGGTAACA 
TCTTCCATTG GAGCTAAAAT CATAATTGGA AGAGGAATAT CAAATAAAAA CTTCATAAAA 
TAACTACATA ACAATAAAAT ACAAATTTAA ACAAAGTAAA ACAGTCGTTT AATAAATTTA 
AATTTCCAAA AATAACATTG AAATTTTTAA AATAAAGAAA TACCACAAAA GGACCGGCAA 
TAAAGGACAT AATCATAAAA TTTTAAAAAG TTAATAAATA ATTTTAATAA AAAAATAATT 
TAAAAAATCA ATCCATAATA AGGCTCTCGC CTGGAACCTT GGTTTTTCTT ATTAAAGTAT 
CTTTATATCC AGCAGATTTA ATAAGTTCTA TATTTTTTTG AACATCATCG GCATTAGTAG
GAATAAAAAC GGTATAAAAC GGCCCATGAG AATTTACTAC AACAAAAAGA CCCGCTTTTT 
TTAATATTCG ATAAGCCCTA TCAGCGTAAT CTTTTTTCCT ATAAGATCCA ACTTGTATGT 
AAAAATCTGT TTCTTTGTCA GCAGAAACTG AGTAATCTGC CAATAAATCT TTGGATTTAT 
TTTTAACAGA AATATTTTCG TCTGATTGCT TGGTTTTCTT TTCTTCTTTA ACCCCAGGAA 133500
TATTAAATGT TTTTTTAAAG TCACTAGATT TTGATACAGA AGGTCTTTTT TCATTCGAAC 133560
TTTCAATCAC TTCAATTTTT ACAGGAGCAA CCCCTATTCC TAAAAAATCA AGCTTCTCAG 133620
CGGCATATTT TGACAAATCG ATTATTCTAT CCTTCCTAAA AGGACCTCTA TCATTAATTC 133680
TTACAACAAC TGATCTATTA^WTAAAAGAT TTGTAACCTT TACGGTAGTA^^AAAGGGCA 133740
ATTCTTTGTG AGCAGCAGTA AGCGCCATCA TATCAAATTT TTCGCCATTA GCAGTAGTTT 133800
TGCCGTGAAA AGCTTCGCCA TACCATGAAG CAAGACCCAC TGTGGCAGAA TTTAAATGAG 133860



CAATTAAATT TCTCATAATG TTTATTATAA TATAAAAACA TATTTCAATA AACAATTAAG 133980 

CTTGCAAATT GCTTATTTAC ATTTTTTTTG ATTTAATTAT AAAAAGAAAA AAGTCTAAAA 134040 

AATGATATCA ACAGAAATAA TTAGCAGCAG CCAAATACAA AAAGCAGCAA AACTTATCAA 134100 

AATGGGAGAA CTTGTAGTAT TCCCAACAGA AACAGTTTAC GGAATTGGCG CAAATGCTTA 134160 

CAATGAAGAT GCTGTAAAAA TGATTTTTTT AGTAAAAAAA AGGCCCATCA ACAATCCTTT 134220 

AATAGTACAT GTTGATACGG TAAAAAAAAT AAAAGAATTA TCAGAATATA TTCCCAAAAG 134280 

TGCCCTCATG CTAATCAAAA AATTTAGTCC AGGCCCTTTA ACTTATGTTC TTAAAAAATC 134340 

AATAAAAATA TCTAGATTTG TAAGTGGAAA CCTAGACACA GTGGCAATAA GAATTCCTGC 134400

AAATAAAACA GCTTTAAGCC TAATAAAAGC ATCTAAAGTC CCCATAGTAG CACCGTCTGC 134460 

AAACATATCA AAAAGACCAA GCTCAACAAA TTTCGAAATG GCCTTAAAAG AATTAAATGG 134520

ACTTGTAAGA GGAATAATAA AACCGGAAGA GAACAAAGAC TTTAATATTG GAATCGAATC 134580 

AACTGTGGTT GGGTTTGACC TAAAAGATAA CGTACTGATA TTAAGACCAG GCGCAATAAC 134640 

AAAAAAAATG ATAGAAAATG AACTTCAAGG AAAATATACA GTAAATTACG CAGAAACAAA 134700 

AATGGAACTA GAAAAATCAC CTGGAAACAT AATTGAACAT TATAAGCCAA AAATTCCCGT 134760 

TTATTTATTT AAAAGTCAAG ATAACATAAG AAGATACTTA AACAAAGATA CGAAAATACT 134820 

TATCACAAAA GCTACTCTAA AATCCTATTT ATTCAATTTT TTTTGGAATA AAAAAAATAT 134880 

TACAGTATTT AACACTCTTG AAGAATATGC ACAAAACCTT TACAAAGAGT TGGTAAATTC 134940 

TGAAAACAAC TACAAACAAA TACTTAGCGA ATTCTTAAAA GACGAAGAAC TTGGACATTC 135000

AATAAACAAT AGAATCAAAA AAGCTAGTTC AAATAGATTC ATTAACAAAA AATGACGCTA 135060

AATTGTTATT TAAAATAATT CAAAAAGCAT AAATATTCAT TAATAAAATA ATGCTAAAGC 135120

TAAAAGCAAA AGTCTAAAAC ACGCCACACC TCTCCTCCCA ACCCGAGCAA AACCAGCAAA 135180 

TACAACCAAG AAGACTTAAA CTTAAATATT ACAAGTAAAT TTTCGCATAA TCACATAAGA 135240 

AAAATTTCAA TCCTTTGATT AATTGAAATA ATCATGGATC AACATAGTAT ACTCAAGTGG 135300

TATTTTATCT TCATCA^AAG CAATACCAAG CGCAAAAACC TTTCCGCTGG GAGTCTGAAT 135360

AACGCTTAAG CTTTTTGATT TACCTTCAAT AAAAATTTCT CCATCTATAA ATTCAAAACT 135420

AAAAATCAAA TCAATTGCAT CTTCCTCGAC ATCCCCATAA TCAAAAGAAG AAATTACTAA 135480 



AAGCAATAAA AAAAAATACA AAGAGAAAAA CAAAGTTTTT ATTATCTCTT AAGATGGCAT
AGCACCCCCA TAAGATAAAT CTTTTATTAA GCATTTATGT TTTGCTCCAT TAAACTTGAT
AAAAGCTTTA TCAGAATCAA TTTTTAGCTT TCTAATAGAA TCTTTATCGA TAATAATCCT 
CTCATGAATT CTCTGATTTT GCCCAAGCTT TAAATCAAGA AGCTTTCCAA CTTTAATAGC 
AATCTCTTCT GGTGCAGGAG ATAAAAATTC TAATGTTAAT AAATTGTATT CTTTATTCAA
AGAAGAATAA GCAGAAGCAC TCAATAGTTT TACAGACAAA AAAGGGAAAA AAGCTGCACT 
ACTCTTAGAA TCTGAATTTT TCTTAAGTTG AATAGAGCCT AAATTTTTAT TTTTAGCCAA 
AGCGGGCAAT ACTGTATCTT CTTGAAAAAT AAGCTTAAGA GAATCCATAG AAATAGAATA 
AATTACTCCA AAAGCGGTAT AAGAGCCTAT TCTCATCTCA ATAGTATTGC GAAGATTTAA 
AAAACTATTT ATCTCTGTGC TCATTTTAAT CTCTTTACCC CTATACTTAG CCCCATAATC 
TCTTATTTTT CTAGATAAAA GCATAAACCT CTCCTTTCTC CTTTTTTTGA ATATAAAAAT 
ACTAAACAAA GCTTCAAATT GTTTTAAACA GTTTTACACA AATAAAAAAG TTTAAAACAA 
TTAAGCTATA AACACGCTTA GCCAAAAATC AATAAATCTA CCCTTAAAAT CAAAAACATA 
CCATCATCCA CCTCTCACAG CATCTTTCAA TTAACAAAAC TTACAAAATG CTGTTTACAT 
AGTGTAAAAT TATAACAATA ATTTTACACT ATAACAATCA ACCCATAACA TTATTAATTC 
TTATGCACTT ATAGTATACT TTAGTAAAAA GTATGAAATT GAATAGCCCT AATTTGAAAA 
AAATAAATAC GCATAAGCTG CTTATATATT TAACATATTT CGCAGTTAGC TTTTCTATTA 
TCACACTCTC ATTAGCAGTA TCTAAGACTA TAAACATACA AAAAGATAAA AATTTCGGAT 
ATGTAAATCC AGCAGTTCCT TCAAGACTTT TAGATATTAA TGGAAAACAA ATAACTCAAT 
TTATATCTGA TGAGAACAGA GAATTAATGC CTTTGAGAAA AATGCCTGAC AATCTAATTA 
ATACGCTTTT GATACGGGAA GATATTGGTT TTTTTTCTCA TCGAGGTTTT TCCTTGATAG 
GAATATTTAG AGCCGCATTT AATATTGTTC TTGGCAGATA TTTTTCAGGC GGCAGCACAT 
TAACCCAACA ACTTGCAAAG CTTCTCTACA CAAATCAAGC AAGAAGATCT ATTTTGAGAA 
AATTACATGA AATATGGTGG GCAATTCAAC TTGAAAAAAA ACTCTCAAAA TACGAAATAC 
TAGAGAAGTA CCTTAATAAA GTTTATTTTG GAAACGGAAA CTATGGAATA GTTGCAGCAT 
CAAAATTCTT TTTTGGCAAA AGTGTAAATA AAATCAATAC AGCAGAATCA GTAATGATGA 
TAATCCAGCT TCCAAATGCA AAACTTTATT CACCTCTTTA CAATCCAGAA TTTTCAAAAA 
AAATACAACG TGCAGTTTTA AACCAAGTTG TATCAAATGG AATAGTCAAG GCTGAAATTG 
CTGAAAAAGA ATTTAATGAA TACTGGCAAA ATTATGATTG GACTAGAATG GCTGACACAT 
CTGCAATTTC AAACAAAAAA GACCAAGCTC CTTATTTCTC TGAATATATA AGGCAAAAAA
TACTAAAATA TTTACCAGAT ^SCGCAAACA TATATAAAGA TGGGTACTCA ^^ATATTCAA 137280
CCCTTGATCT TGAAGCACAA AAATATGCAG ATAAAGTTAC AAACGACATG ATTAATAAAG 137340
CAAGAACAAT GCACAATTTA AATAGATCAT CTGAAACAAT AATCATTAAT TCAGAAATTG 137400
TCCCTGTAGT AGATGCGATA TCAGATTTAT TGGGAATTAA AAATTTAAGA ATAAATGGAA 137460
GACAATATAA AAAACTGAGA AAAAGAAAAT TTTACGAAGA CAATATTGAT CTAATTGCAA 137520
GTTTTGGAGC TATACTTGGA ATTGATAAAA TAGATAAGGC GACAAAAGAA TATATTATGA 137580
AAAATAAATT AACACCGAAA CTTATTGCAC AGCCTGAAGG AGCAATGATA GCAATAGATA 137640
CAACAAGTGG AGCAATAAGA GCCATGGTTG GGGGAAGTGG ACACACTAAA GACAATGAAT 137700
TTAATCGAGC CACACAAGCA AAAGTTCAGC CTGGAAGTGC ATTCAAAGCA TTATATTTTG 137760
CAGCCGCAAT TGATCTAAAA AAAATAACAG CTGCGACAAT GTTTTCAGAC TCTCCAGTAG 137820
CATTTCTAAA TAAAAATGGA GAAGTTTATG CTCCGGGAAA TTATGGCGGC AAATGGAGAG 137880
GCAACGTTTT AACGCGCCAA GCATTAGCTT TGTCCTTAAA TATTCCGGCA TTAAGAATAT 137940
TAGACCGGCT AGGCTTTGAC TCTGCAATTA GCTACTCCTC AAAACTACTA GGAATAACAG 138000
ATCCAAAAGA AATAGAAAAA ACGTTTCCAA AAGTTTATCC ACTAGCGCTA GGTGTAATAT 138060
CAGTTTCTCC AATCCAAATG GCAAGAGCCT TTGCAATTTT AGGAAATAGT GGTAGCGAAA 138120
TCGAACCTTA TGGGATAAGA TACATTGAAG ACAGAGCTGG AAGAATAATA ACAAATGAAG 138180
AAGCAAGCAT ATTGGCTAAA ATAAAAAACA AAGAACACCA AACTCAAATA GTATCTCCTC 138240
AAACCGCTTA CATAATCACA GATATGATGA AATCAACAAT TCAATACGGA ACCCTAGCAA 138300
ATCAAAGATA TACAAATCTC AAAAATTTTA AATCAGACAT TGCTGGAAAA TCGGGAACAA 138360
CACAAAATTG GGCAGACGGA TGGGCAATAG GATACTCTCC TTATATAACA ACAGCATTTT 138420
GGGTTGGATT TGACAAAAAA GGATATTCAC TGGGAATATC TGGAACAGGA ACAGGATTGG 138480
CAGGGCCTAG TTGGGGAGAA TTTATGGCAG AATATCACAA AAACTTACCC AAAAAAGTTT 138540
TTGTAAAACC TGCAGGAATA ATTAGCATCC CCGTACAAGC AGAAACGGGT CTACTACCGG 138600
AAGAAATTGC TGATGAAAAA ATAATAAATG AACTATTTAT TTCCGGCACC CAGCCAGTTG 138660
AAAAATCAAA ATATTATGAA AATAAACAAG AATTTAAAAA TACAATAGAA TTTAACATAT 138720
ATGGAATTGA TGAGATTAAT AATAACGATG AAATAAATTT TGACACTCCT GAATTTGAAT 138780
ATCTTGATAA TAATCTTGAA AGCTTTAATA ACAATAGTAA TAATGATAAT AATCTTGAAA 138840
GCTTTAACAA TAATAACAAT GATCTTGAAA GCATTAATGA TAATGAAGAA AATAAAAATG 138900
AAGATGAAAT AGAAATGAAC ATTGAAGAAC CCTTAAATGA AATAGAAAAT AAAAATCCAC 138960
AACAAGATCT AGTTAATAAC AATAATAACC AGGAAATGCT TATTGAAAAC ACCAAAGAAA 139020
TTAAAGACGA AGTCATTGTT AATGAAACAA ACATAGAAAC ACAAAGCACA AAAGAATTAA
ATTCAAACAA CAATGAAAAT GAAAAAATTA ACAACAAAGA CGTCAACGGA GAAGATATCC
AATTGGATTA AAACAATATG TTAATAGATA TCGATCAAAT AAAAATAAAA AAAAGAATTA
GAAAAAATAT AGGAGACATT GAAACTCTTA AAAACAGTAT TATAAAACAT GGATTAATTT
ATCCAATAAT AATAGATAAA AATAAAAACT TGATAGCAGG ACTTAGAAGA TATCAGGCCT
TAAAAGAAAT AGGCTATAAA GAAATTGAAG TAAAGGTAAT CTCAATTGAA AACAAAAAAA
CTTTACTTGA AATTGAACTT GATGAGAATA ATGTTAGAAA ATCATTCACA AGAAGCGAGG
CAAACGAAGG AGAAGCTTAC TTAAAAATTT ATTCTGAAAG CAATATAATA ATAAGATTCC
TTAAATTTAT TATCTTAAAA ATTAAAAACA TGTGTAAAAT AAGAAATAGA AAAATTTAAA
TCATAATAAA GAGGTGTGTT TATGTTAAAT TACAAAAATC TTAATGAACT TGAAAATTTT
AAAATCCTTG AAGGTATTGC TCCAGAAGTG CTCAAAACGG CATTAACTGG AAAAAGGATA
AAAGAATACG ACATTACAAT AGAAGGAGAT AGTGTACATT ATAACTATGC TTCAAAACAA
ATTAATGAAA CCCACCTTAA AATTTTTCAA AATTTAAGCG ATGAAGCAAA TTTAATAGAA
AAATATAAAG AAGTGCTTGA TGGGGAAAAG ATCAATATTA GTGAAAATAG AAAAGTCCTG
CATCACCTTA CAAGAGGGCA AATTGGTAAG GACGTAATAG AAGACAATAA AGAAAATATG
AGAGAGTTTT TCCAATCAGA ACTTGAAAAA ATATATAATT TTGCAAAGCA AATTCATTCT
GGGAACATTA AAAGTTCAAA TGGCAAAAAG TTTAAAAATG TAGTTCAAAT AGGAATTGGT
GGATCTAGCC TGGGGCCAAA AGCTCTTTAC AGCTCAATAA AAAATTATGC AAAAAAACAC
AATCTAGCCC TAATGAATGG TTATTTTATT TCAAACATTG ATCCAGACGA ATCAGAAGAA
GTATTAAGCA GCATTAATGT TGATGAAACG CTTTTTATTA TTGTCTCAAA AAGTGGAAAT
ACATTAGAAA CTAAAGCTAA TATGCAATTC TTAATAAACA AATTAAAATT AAATGGCATA
AAAGAATATA AAAAACAAAT GGTCATTATA ACACTAAAAG ATAGCATGTT GGCAATAGAA
GAAAAAGGAT ATCTTGAATA TTTCTTCATG CATGACTCAA TAGGTGGAAG ATTTTCTCCA
ACATCAGCAG TTGGACTTAC ACTACTTACT CTTTGCTTCA CAGAAAAAGT TGCAAAAGAA
ATTCTAAAAG GAGCCAATGA GGCTGACAAA AAATCATTAA ACAAAAACGT AAAAGACAAT
GCATCTCTCT TGGCAGCACT AATTAGCATA TATGAAAGAA ATGTTCTAAA TTACAGTAGC
AACTGCATCA TTGCTTATTC TAAAGCAATG GAAAATTTTT ATCTTCATTT ACAACAACTT
GAAATGGAGA GTAATGGAAA AAGTGTAAAC AGATTTAATG AAACAATAAA CTACAAAACT
GTAAGAATAA TTTGGGGAGG CATTGGAACA GATGTTCAAC ACTCATTCTT TCAAATGCTT
CACCAAGGAA CGGATATAGT :aatggat TTCATAGGTT TTAATGAAAC :^^ac ^CTTAAA 140820
GAAGATGTAA TATCTGATAA CAGCTCAAGC AATGATAAAT TAAAAGCAAA TTTAATAGCC 140880
CAAATAATAG CATTTTCAAA AGGTAAAGAA AATAGCAATA AAAATAAAAA TTTCCAAGGC 140940
GAGAGACCTT CTGCACTAAT ATATTCAAAA GAATTAACAC CTTATGCAAT AGGAGCAATA 141000
CTCTCCCATT ATGAAAATAA AGTAATGTTT GAGGGATTTT TATTAAATAT AAACTCATTC 141060
GACCAAGAAG GAGTTCAGCT AGGAAAAATT ATTGCAAATC AAATTTTAAA AAATGACAAT 141120
TTTAAAGATG AAGTAATAGA ATCTTATTCT AAAAAAATTC TAAAAAAATT TTAAAACAAG 141180
ATTAATTAAT TTTTGAATAT ACCCCCTTAA GTTTAAAAAA GAATGCACTA AGCTTATATA 141240
AGAGGTAATA ATGGATAAAA TAAGTATATT ATATACATTA ATCAATATTA TAATAATGCT 141300
TATTCTAATA AGCATAGTTT ATCTTTGTAA AAGAAAAAAT GTTTCTTTTA CAAAAAGAGT 141360
GTTTATAGCG TTAGCAATCG GAATAGTATT TGGAATGACC ATTCAATATT TTTATGGAAC 141420
AAATTCAGAA ATAACAAACG AAACTATAAA TTGGATAAGT ATTTTGGGCG ATGGATACGT 141480
AAGGCTCCTT AAAATGATTA TAATCCCCTT AATAATAACA TCAATAATCT CTGCAATAAT 141540
AAAACTAACC AATAGTAAAG ATGTTGGGAA AATGAGCCTA CTTGTAATAT TAACACTAGT 141600
ATTTACAGCA GGTATTGCTG CCATAATTGG CATTTTCACT QCTTTAGCAT TGGGATTAAC 141660
AGCCGAAGGA CTACAAGCGG GAACCATCGA AATTTTACAA AGTGAAAAAT TGCAAAAAGG 141720
CCTTGAAATA TTAAATCAAA CAACAATCAC AAAAAAAATC ACAGATCTTA TTCCACAAAA 141780
TATATTTGAA GATTTTGCAG GGCTTAGAAA AAACTCAACC ATCGGGGTCG TGATATTTTC 141840
AGCTATCATA GGAATAGCCG CCCTTAAAAC ATCTATCAAA AAGCCAGAAT CAATAGAATT 141900
TTTTAAAAAA ATAATATTAA CACTCCAAGA CATAATATTA GGTGTAGTAA CTTTGATTTT 141960
AAAACTAACG CCTTATGCTA TATTAGCTTT AATGACAAAA ATTACAGCAA CCAGCGAAAT 142020
CAAAAGCATA ATAAAGCTTG GAGAATTTGT AATTGCTTCC TACATTGCCA TAGGTCTTAC 142080
ATTTCTTATG CATATGACAT TAATTGCAAT AAATAAATTA AACCCAATTA CTTTTATAAA 142140
AAAAATATTC CCAGCACTAT CATTTGCATT CATATCTAGG TCGAGTGCTG CAACCATACC 142200
CATTAATATA GAAATTCAAA CTAAAAATCT GGGAGTAAGC GAAGGAATAG CAAATTTATC 142260
AAGCTCCTTT GGAACATCAA TTGGGCAAAA TGGTTGTGCA GCACTACACC CCGCTATGCT 142320
TGCAATAATG ATAGCACCAA CTCAGGGAAT AAACCCCACA GATATTTCAT TTATACTCAC 142380
ACTTATTGGA TTAATAATAA TAACTTCATT TGGAGCTGCT GGCGCTGGTG GAGGCGCAAC 142440
AACAGCCTCA CTAATGGTGC TCTCAGCAAT GAACTTTCCA GTGGGATTGG TAGGACTTGT 142500
AATATCTGTT GAGCCTATAA TTGACATGGG AAGAACAGCT GTTAATGTAG GCGGCTCAAT 142560
GCTTGCAGGC GTTATATCTG CTAAACAGCT CAAACAATTC AACCATAATA TATACAACCA
AAAAGAGCTT GTAAACAAAT AAATAGGAAA ACAATGATGA AAATAATAAT TATTGGGGGC
ACATCAGCAG GAACTAGTGC CGCAGCTAAA GCAAACCGCT TAAACAAAAA GCTAGACATT
ACTATCTATG AAAAAACAAA TATTGTATCT TTTGGAACCT GTGGCCTGCC TTACTTTGTG 
GGGGGATTCT TTGACAACCC CAATACAATG ATCTCAAGAA CACAAGAAGA ATTCGAAAAA 
ACTGGAATCT CTGTTAAAAC TAACCACGAA GTTATCAAAG TAGATGCAAA AAACAATACA 
ATTGTAATAA AAAATCAAAA AACAGGAACC ATTTTTAACA ATACTTACGA TCAACTTATG 
ATAGCAACTG GTGCAAAACC TATTATTCCA CCAATCAATA ATATCAATCT AGAAAATTTT 
CATACTCTGA AAAATTTAGA AGACGGTCAA AAAATAAAAA AATTAATGGA TAGAGAAGAG 
ATTAAAAATA TAGTGATAAT TGGTGGTGGA TACATTGGAA TTGAAATGGT AGAAGCAGCA 
AAAAATAAAA GAAAAAATGT AAGATTAATT CAACTAGATA AGCACATACT CATAGATTCC 
TTTGACGAAG AAATAGTCAC AATAATGGAA GAAGAACTAA CAAAAAAGGG GGTTAATCTT 
CATACAAATG AGTTTGTAAA AAGTTTAATA GGAGAAAAAA AGGCAGAAGG AGTAGTAACA 
AACAAAAATA CTTATCAAGC TGACGCTGTT ATACTTGCTA CCGGAATAAA ACCTGACACT 
GAATTTTTAG AAAACCAGCT TAAAACTACT AAAAATGGAG CAATAATTGT AAATGAGTAT 
GGCGAAACTA GCATAAAAAA TATTTTTTCT GCAGGAGATT GTGCAACTAT TTATAATATA 
GTAAGTAAAA AAAATGAATA CATACCCTTG GCAACAACAG CCAACAAACT TGGAAGAATA 
GTTGGTGAAA ATTTAGCTGG GAATCATACA GCATTTAAAG GCACATTGGG CTCAGCTTCA 
ATTAAAATAC TATCTTTAGA AGCTGCAAGA ACAGGACTTA CAGAAAAAGA TGCAAAAAAG 
CTCCAAATAA AATATAAAAC GATTTTTGTA AAGGACAAAA ATCATACAAA TTATTATCCA 
GGCCAAGAAG ATCTTTATAT TAAATTAATT TATGAGGAAA ATACCAAAAT AATCCTTGGG
GCACAAGCAA TAGGAAAAAA TGGAGCCGTA ATAAGAATTC ATGCTTTATC AATTGCAATC 
TATTCAAAAC TTACAACAAA AGAGCTAGGG ATGATGGATT TCTCATATTC CCCACCCTTC 
TCAAGAACTT GGGATATATT AAATATTGCT GGCAATGCTG CCAAATAGAA AGAATTAAAT 
TAATTTAATT CTTCATGCTA ATTGGTTGCC CCGTACTTGA AAGAACATCT CTCCAAAAAG 
AACCATTTGG ATTAACCTTA TTTCTGTCAA TTACTGCCAT CTTAATAGGT ATATGAACAA 
ATTTTGTACT CCATAAACTA ATCAACATTT TTGTCTTACC AGCCATTGCA GCATGCACAG 
CATTCGACCC AAGCCTAGCA CAATAAAGCG AATCACTGGC ATTAGCAGGT GAACTTCTAA
TAATATAGCT GGGATCAATG TATTTAAGAG TAAATTGTAT ATTTTTTGCT TTAAAATATT
CTGTAATTTT ATCTTTAATA^BlAAGCCCAA TATCCTCATA AAGCAAATTC^TCAGAATCGT
CTTTCTTCTT AGGAAAATGA TCAAAATATT TTTGGCCTGC TCCTTCTGCT ATCAATATTA 
CTGCATGGGG AATCTCTTCT AAGCTTTCTT TCTCTAAAAG TCGTCTTTCA AGATGAACAA 
GAAATCCATT AGGACCTTCT ATGTCAAAAT CAAGTTCTGG GATTAAACAA AAATTAACAT 
CATTAGAAGA AAGTGCGGTA TGAGCAGCAA TAAAGCCAGA ATCCCGTCCC ATAACTTTAA 
CAAGTCCAAT GCCATTATAA GCACTATTAG CTTCAAAATG AGCACCAGCA ACAGCTGCAA 
CAGCTTGTTC TACAGCAGTC TCAAATCCAA AAGATTTTTG AACAAACATA AAATCATTGT 
CTACGGTTTT AGGAATGCCC ACAACTGCTA TTTTTAAATT TCTTTTTTCT ATCTCCTCAG 
CAATAAGAAG AGACCCCTTT TGAGTACCAT CCCCGCCAAT GTTAAAAATC ATATTAATGT 
TCATTCTCTC TAAAGTATCA ACTATTTCCA CAGGCTTAAT ACCACCCCTT GAAGAACCAA 
GAATAGTACC TCCAAATTTA TTAATATCAT CAACAACATC TGGATTAAGA TTAATAAAAG 
GTGAATTTGA CTCAGGAAGA AGCCCTTGAT ATCCAAATTT TACTCCATAA ATATTGCGAA 
CCCCATATAT TTTCCATAAA GTTCGCACAA TAGAGCGAAT AACATCGTTA AAACCAGGAC 
AAAGCCCACC ACAAGTAGTA ATAGCAGCTT TAACATGCCT GGGCACAAAA TAAATTTTTT 145140 
CTCTAGGCCC AGCTTTTTCT AAAAGAACAT CTTCATACCT ATCTCCCTTA TCCTCATTCC 145200 
TATATACACT AAACTTGATT TTATTTTTTT CATTAACAAA ATGGGAAGAA CCCTCACTAG 
CATAAAAATC AATCAAAGGA TTGTTTTGCT TGCATTCTCC CAAGCTATCT ATTTTAAAAT 
CTAAATTTTC ATTTTTAATT CTATACACCA AATACTCCTT TATAGAATTA TAACCTAATT 
ATTTTCTAAT AAATCGACTT TGATCTTTAA TCATATCGTA TATGTCATCG TAAATATAAG 
GAGACCCTTC AATAGGAGAT TTAATTAAAT TACCAGCTAT GAATTCAAAA TATTTATTCA 
ACTTTGAATT TTTCTCAAAA TCAATAAATG GAACTCTATT ATTAATAGCC TCTCTGAAAC 
TTTTTGCAAA AGGCACAAAA CCTATAAACT CTATTGGTAT ATTAATATTA TTCTTAACAA 
CATTAATCAA ATTTTCACAC ATAGCAATCT CTTCACTAGT TTCTATTCTA TTTAGCACCA 145680
CTCTAGGATA AAAATTATTC ATCATCCTCT TAACTTTCAA AGAGGAACTC AAAGAAATAA 145740
GTTCAATCCC AACAACCAAA TCTTTAAATC CAAGGTTTGT CCCCTCAATC TTATCTTTAA
AAAAATTACC AATATAATCC CGTTCGGGGC TTTTTTGCGG AAATCCTAAA TATAAAAGAC 
GATAAAGAGC ATTCTTTAAA AAAGAATAAG CATTAAGTAT GGAAGGGGTT TCTGGTATTG 
TAACAATTAC ACCGCTGTAA GATGCCAAAT AAAAATCTAT TGTATTATAA GAAGTTCCAG 145980
ATCCCAAATy TAAAAAAATA AAATCAGCAA TAAGATCTTT TTGAATGGAT TCTATAATCT 146040
TTTTCTTAAT AGAAAAAGGA AGATTAGCTG TTCCCGTATA AAGAGCATCA CCTGGAATAA 146100
GATAAAGCTT ATCATAAGAT GTTTTACATA CTAAATCTGA AAAACTTTTA CTCTTTTTAT 146160

TAATAAAAGA ACCAATGCCC ACACCCTTAT TTTTAACCCC CAAACACGTA TGTAGATTAG 146220 

AGCCACCAAG ATCAAGGTCA ACAAGTATTA CAGTTTTACC CAAACTAGAA AGCTTATAAC 146280

CAACATTTGC AACAAAAGAT GTTTTTCCAA CACCGCCTTT GCCACTTGCC ACAGGAATAA 146340

TTTTAGTCAT TCTTAAATCC TAATTATCCT TACGATCTTT TTGAAAAATT TTCATAAAAT 146400

TGAAAATCCC TAAAAATCTA GATTTTTTCT CAGCATCTTT ATTTAAATTT TCATCCTCTT 146460 

TAGAGCCTGA AATTAAATCT TTAATTAAAT CTTTATTATT TGTAAAATTT TCAATGCTAT 146520

CTGGTTTACC ACAAATTACA ATTTTATCAT CTTTTAAAAA AAAATAATCG CCATCAACAA 146580

ATTCATACCT AGAATTACTT AAATTTCTAA CAGCAATAAC TGTAATCCCA CATTCTCTTC 146640

TAAGATCGGG TTCAAAAAGA GTTTTACCAA CATATTCTTT GGGAATAACA GTTTCAGCAA 146700

CAATAATATC ATACCCAATA ATATTATAAG TTGAAAGATT TGGAGATACT AATAATGGAG 146760

TTAATCTTCT TGCAGCATCT TTACTTGGAA ATATAATTTT TGTTGCCCCA AGAGTTTTTA 146820

AGATTTCAGC ATCATCTCTA TTTTCTGTCT TAACGCATAT TTCTTTCAAA CCTAAAAGAT 146880

TACAATAGTG AGTAACAAGA GCACTTTTGC CAAGATCATC ATCAAAATCA ATAACAACAG 146940 

CGTCTGTATC TACTGGAATT ATTCTTTTCA AAGCATTTTT AGTGAATTGC TCAACAACAA 147000

AGCTTTCTGT AGATATCACA TCATATTCTT CAATAAGCTC TTTAGATGTA TCTATAATAA 147060

TAATTTGACA ATCAAGCCTG CTTAAATCTT CAAGTAAGTG AATGCCTAAA TTACTAAGTC 147120

CAATAATAAC AAATGTTTTC ATATGCTTCA ACCAACCAAA ATATCTTGCC TTGGCCTTGT 147180

AAATTCTTCA AAACGCGACT TTCTTGAAAC AAAAACAGCC ATTGAAAAAA GCCCTATTCG 147240

TCCTGCAAAC ATAGTAAAAA TTATAATGAC TTTCCCCCAA AATGACAAAT CCTGAGTTAC 147300 

TCCAACTGAA AGACCAACCG TTCCAAAAGC AGAAAATACT TCATAACCTA AATCAATAAC 147360 

CTTCCAATTG CCAGATCCTC CCTCAAAAAA AAGAAGCATG AAAAAAGAAA AACTTAAAAT 147420 

AAAAATAGCT CTTGCAAAAA ATAAAAGTGC AAATCTTATA CTATCTATTG AAACCTTGTA 147480 

AGAACCAATA ATATATCCAT TGCCGTTTTG ATTTTTAACA ACAGCCAATA CAATTAAAAA 147540 

AAATGTTGTA ATCTTAATCC CTCCTGCAGT TGATCCGGGT GCACCACCAA TAAACATGAA 147600 

TGGTAGAGAA ATTATTTGAG TTCTTCCGCT TATTAAAGAA TTATCAAGAT AATTAAAACC 147660

AGCTGTTCTG GTACTAATCG AATAAAAAAT TGAATTAAAT ATTAAAGTGC TCATTGAATA 147720

ACCAGCTTTT AATTTATGCA TCTCTGTAAA AAAAAATAAA ATTGCACCAA TTATAATTAA 147780

AAAGAAGCTT AAAGAAAAAA CTATCTTGGC ATGAAGCGAT AGTTTTTTTT TGTTTTTAAT 147840

AGTGTTATTT ACATCTCTAT^TOACCATAAA CCCAAGCCCA CCACAAATTA^WaAAATAGA 147900

GACCACAACT ATAGCTTCAG GAACATCTCG CCATGCATAA ATACTCTCAG AATGCATGGA 147960 

AAAACCTGCA TTGCAAAAAG CAGAAATTGT CGTAAACAAA GCCTCTAAGA ATGAAATATT 148020 

CACTCCCCTA AGTTTAAAAC AAATAAGTAT TAATATTAAA CCTATCATTT CAATTGAAAA 148080 

AGTTATAAAC AATATGCTTT TTAAAATTCT AATAGGATTA TATTCTATAT TTGAAAGGGA 148140

ATACTGCTTT ATTATTGTTG CATCTGTTAA ATTCATTTTC TTTTTAGGTA TAAGCAAATA 148200

AAAAGTAGTA ATACTTATAA ATCCAAGTCC CCCAAGCTGG ATTAGCAACA TTATCAAAAT 148260

AAATCCAAAA GTAGAAAAGC CTTCCATTTT AACCGTTGTA AGGCCCGTAA TACTTACAGC 148320

AGAAACAGCA GTAAAAAGAG CATCAATGTA TGCTAATTTG CCATCACCTT CCCAGGAAAT 148380

AGGCAACATC AACAAAAGAG AGCCTATAAA CATAATTAAA ACAAAATAAC TAAAAAGTAA 148440

AAACCTGTCG CTAAATTCAA ATTTCAACAT ATCATACAAA AAGTTGTTTA AATTATTAAA 148500

AATTTATCTT ATATAGCATA ATATTTTAAC ATTGAAATAT TATCATAATT ACATTATTTT 148560

TAATATATGT TTGAAATAGA ATCAAAAGCA TTTATTCCTA CAAAAGAGTT AAAAAGAATT 148620 

ATCAAGCTAG CAAATAAAAA ATTTAAGTTT ATTAAAGAAG AAATAAAAAC TGACATyTAT 148680 

TACTCAAACC AAAAAAAAAT TATAAGAATA AGAAAATTAA ATACTCTAGA AAAAATTGTC 148740 

ACATTCAAAA AAAAAATATT AGACAACAAC AATACTGTAG AAATTAATAA AGAGATAGAA 148800

TTCAAAATAG ATAGTATTAA TAATTTTTTA ACCCTTATAA AAGAGCTTAA ATTTAAAAAG 148860

CTATACAAAA AGATAAAAAA AAGTTTAATT TATCAAACTA ACAATTTAAA TGTAGAGATA 148920 

AACGAAATAA AAAATCTTGG GTTTTTTTTA GAAATAGAAA AAATAATTAA CAATCAAAAT 148980 

GATATAGACT TGGCAAAAAA AGAAATTGAC AACATAATCA ACCAATTTGG ATTAAAAGAA 149040 

AACATTGAAA CTAGACCTTA CTCTGAATTA CTTTCATTGG CAAATCAAAG TAAAAAATAA 149100 

TTCATTGGAA TTAGAGCTTA AAGTAGAGAT TACAAGCCCT TGATTGCCAT AAATTCCAAT 149160 

CTGAGGGCTT TTAACATTAC TCTTAAAATT CTCAAGCTTA TTTAAAAAAT ACCAATTTTT 149220 

ATTCTTAAAA TAAATTAATC TCACATTATT ATTGTCCTCA AAAGCTAAAA ACAAATTATT 149280 

TTTATAAAGC CCAATGTCAG CACTTAAACC TTCCATTTCA ACATTAGGAC TTATATTAAT 149340

CCATCTACTA CTTTTCAAAG GACAAATGTT TACAATAGGT CTATTTTCAG AAACAAAACT 149400 

CATAATTATT TGATTAAAAT TAGAATCAAA AAAGCCTTTA ATAAAATTGG CCATATAAAC 149460 

AGAAGGAATA TTTGCATTTA CCCAAGCATT TTCATTGTTT ACAATAAATT CAGATTTAAT 149520

CTCATTATTT GACTTATAAT TATAAAAAAT GCCCAAAAAA GGTTCAGATA TTAAACCAAT 149580

GTTTGATGAA TTAACATTAG AATCACCTTT ACTTAAATAA GCATGTATTA CATCGGTCCA 149640
AATACTTCCG TAACCCATAT TCGAGATTAA ATTAATTTTA TATTCACCCC TAATTTCCCT
TAAATATGCT AAATACAACC TATCTTTTAA ATCAATGCTA ATATTTAATA AAGATCCAAA 
ATTTTCTATG TGACCAGGAC TAATATCAAT CCATTTTCTA CTATTAAATT TTTTAACTAT 
AAGCTCGCTG GCAAAATCAG CCCCTGATTT CGTAACAAAA GCAATATATA AATTTCCTTT 
AGAATTAATT GAAAAATCAA AATTAACTAT ATTAGTAATA TTTCTATTAA CAGATGAATC 
AAGATTAAAC CAACCAACAT CCTCAATAAA TTCAGCAACT TTAATATCAT CGCTATTTTC
TAGCTGATAA GCAATATAAA TATTGCTTTT ATAAATCCTT AATACATATT TTTTAAGCTT 
GGCAGTTAAA TTTAAAACAG GCAAATCTTT TAAAGTAAAA AATAAATCTT CTTTTTCAAT 
CTTAGATGTT TTAGCTTTCA GGGAATCGCT ACTTAAAATT GAAAAATCTA AATCTGTAAG 
CGAAAACTTT ATGTTTTCAT TTTTAGTACC AACATATATT ATTGCATAAA GAGAATTTTT 
ATCTAATCTT ATTTTAAAAT CTCTTCTTTT TACTTTATCG GTTATATATT TTTTATTAGA 
AATGTCATAA ATTTTAAAAA CAAAATCGGA ATTTGAACTC TTATCTAGGG TTAAAATATA 
ATCGGAAGAT TTGCTAACTT TTAAGTAAAC ACTTCCTTTC CCATTTTTGC TTAAAATACT 
TAAAGGACTA ATTTCTGTTA ATATTTGATT TGCTTGAGCT TGAACAAAAG AAAATTTTGT 
AAATAAAAAT AGCAAAATGA ATGTCTTATT TATTTTCATA TTTTTTTACA TTCAAAAATA 
TTAACACATA TTCTAAAAAT GATAAAATTG CAAAAAAAGC AGCACAAACA TATGTCATTT 
GAACAATAAA TAAAAATTTA AATTTAAAAG TTAAAATGTA ACTAATAAAA TTTTGAACAG 
ACTCTGTAAA GTTGAGTTGA TTTAAAGTAT AAAATAAAAG GCTTGCAAAA GTGCAAACAG 
CATAAAGAAG TGACTTTAAT TTCCCCAAAA AATTTGCTTG TTGAACTACA TTAAACTGAA 
TAATTAAATT TCTAACAAAC CCAATAGAAA TTTCACGATA AATAAATATT ACAAAAAAAT 
AATAGGGGGT TATACCTTTG TAAAAGAAAA AAACAAAATA TGTTAAATGC TGCAAAACAT 
CCGCATAAGG ATCTAAAATT TTACCTACAT TGCTAACAAG ACCATATTTT CTTGCAAGAT 
AACCATCAAT AAAATCAGTA AATTCATTAA AAATAATTAA AAACCAAATA ATTCCAAAAA 
ACAAATACGA AAAAAATACA TTTTCCAAAA AAAATAAAAT TAATATGATA AAGGAAAGTG 
CAATTCTAAC TAATGTTATT TTATTAGGGG TAATGACCTT GATTAAATTA TTCAATTTAT 
CAAATCTCCT TATCTCTTAT TTTAAATAAA ATAAATTTAA GAGCTTCATC AAGTTTCATT 
CCATTTATTT GCTCATTTGT TCTTGTTCTA ATAGATATTC TCTCTTCTGT TGCTTCTCTC 
TCACCAATTA TAAACATATA AGGTATTTTT TTAGCCTGAT ATTCTCTAAT TTTAGCATTC 
ATTCTTGAGG AACTATTATC AAGCTTTATT CTAATCCCCT CATTTTTAAA TTTATTAAAA 
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ACCTTAATAG CATAATCTTC ^^AATATTG TTAACAGGAA 
GATAACCATA AAGGAAAAGC ACCACCATAG TGCTCTACAA 
ATAGATCCCA ACAAAGCTCT ATGAATCATA AATGGTCTTT 
GTATAAGTCA TATTAAATCT CTCAGGGAGA TTAAAATCAA 
CACTCTCTCT CAAGCGAATC AACTATCTTA AGATCAATTT 
CCACCCTTAT CAATTTCATA AGGAACTTCA AAATCGCTTA 
AAAGACATTT CCCAATCAGA ATCATTGCCA ACAGATTTGT 
GCCTTTGGGT TGC TAAAGCC AAATTTACTC CACATATAAA 
TTAATCTCAT CTAAAACCTG AGAATGGGTG CATATAATAT 
CCTCTGGCTC TCATCATACC ATGCAAAGCA CCTATCTTTT 
AGTTCGGCCC ATC TAAATGG CAAATCTCTA TAAGAATGCT 
ATATGAAAAG GACAATTCAT GGGTTTAAGA TAATAATCAC 
TCAAACATGC TATCCTTATA AAAGTCTAAA TGACCAGAAG 
CCAATATGAG GAGTAAAAAG AATATCATAC CCATTTTTGG 
TCTTCTATTA AAGCTCTTAT TTTGGCACCA TTGGGATGAA 
ATCTCTTCAT GTATAGAAAA TAAATCAAGC TCTTTTCCAA 
TTTATTTCCT CTCTCAAATT AAGATAAGAT CTCAGTTCTT 
CCATAAATTC TGGTAAGCAT TGGGTTTTTT TCACTGCCCC 
CTAGTAAGCT TAAATGCCTT TGGATCAATT TTATTCATAT 
CAAAGATCAA CAAAATTGTG ACTCTTGTAA ATAGAAACTT 
TTAATCAAAT CAATCTTATA AGGTTCATCT TTAAAAATTT 
ATTATCTCTT TTTCAAAAGA ACTTCCGGTC TTTAAAATTT 
TCTAAAAGAG AATCTTCTGT AATCTGCTTT TTAAATTCAA
TTAATAGGAG GACCTATTGC AATCTTGGTA TTTGGAAATA
ATAACATGAG CTATTGAGTG TCTTTTTTTG TAAAGAATAT 
CTCACAACAA TACCTTTTGC CTTTCGCTTT TTATTAAAAA 
CTTTTACGTA AAAATACGCA CCTCAAATAT TTATAATTAC 
AATTTTCTAA AAAAATAGAG ATAAGAAAAC AAAAACCTGA
GCAACTATTG ATTCAATATT AAAATAAAAA GACATTGCTA 
GAACCTCCAT AAGAGAGAAA AGGAAAGGGA ATCCCGGTAA 
r^^AAC



TGATTACTAC^^RAACAGGA 151440 

GAATTCCAAA AAATCTTTCA 151500 

TTTCTTTACC ATCCTCAGCG 151560 

ATTGAATTGT ACTCATCTGC 151620 

TAGGCCCATA AAAAGCACCT 151680 

AAGTCTCTTC AAGAACTTTT 151740 

CAGGCTTTGT AGAAAGATAT 151800 

TAGCAAACCT AAGAACTTCT 151860 

GAGCATCATC CTGAGTAAAC 151920 

CATAACGATA CACAGTGCCA 151980 

TACCTGTATT GTAAATTGCA 152040 

TTTTATCCAT TTCTATTTTT 152100 

TTTGCCAAAG CCAAGATTTG 152160 

AGTGCTCTTC TCTCCAAAAA 152220 

AAAAAACAAG TCCTGGTCCA 152280

GCTTTCTATG ATCTCTTTTT 152340 

TTTCATTATT CCATAAAGTT 152400 

GCCAATAAGC CCCAGCAATA 152460 

TCTCAACATG AGGACCTCTA 152520 

CATTTTGTAA ATCAAAATTT 152580 

CAAGAGCCTG TTCTACGCTT 152640

CTCTCATTCT ATTTTCTATG 152700 

AATCATAATA AAAACCATCT 152760

AATCAAGAAC AGCTTCTGCC 152820 

CTTCTTTATC TAAATCTTTG 152880 

ATTAAAATTC ACACTCATCA 152940 

TAAATTAAAA TATACAAAAA 153000 

AAATAAATTT TCAATCCATA 153060 

AAAAAAATGT AATAGTAGAA 153120 

TAGGAAGAAC TCCTAAAGAC 153180 
ATTCCAACAT
TAAAAGAAGT ATGAAAAAAT AAAAGTCCCA
TATCTTGACT TTTATTCATT ATTATCAAAA 
AAATAGTGCT AACACCCAAA AACCCAAACT 
TGCTTTGAGA TGGCACATAA TTAGCGTGGG 
AAAGACCGCC AGAACCAATT GCTATTTTAA 
CATCAATAGC CGGATCTAAG AATACCAAAA 
TTGAAAGAAC CTTTGAAAAC ACTATTGAAA
AAAAATAAAT TATTTTAATA CTCAAACCAT 
TCAAAAGAAT TAAAAGCAGC ACTCCCATTA 
TAAGATAAAA TACATTACCC ATATTCACCT 
AAACAAAAGA AAAAAACCCT ATCAACGCAA 
CAAAAAAAGA AATAAATATA AAAATGGTTA 
GCAATAATAT AAGAATTACC GATGGAAAAA 
ATTCATTATA ACCCTTTTTT TCAGTGTAAA 
TACCAAATTC AGAAGGCTGT CCTCCAAGTT
TTACTGTCAT TCCAAAAAAT GCAGTAAAAA 
GATATACCAT GCTATAAACA AATTTTAAAT 
ATCCAATAAT TACCCAAAAG GTTTGTTTTA 
TATTATAATC GCTAGAATAA ATCAACAATA 
TCAAAGCCAA ATAATCATAA TTTTTTCTAA 
CTATAACCTT TAAGAATATC TTCATAACTT 
TCTGTAGATT TTGCAGGCCA CCAATCCACA 
ATAATTTGAT TATCAGCTGA ACCGTTATAA 
AAACCATCTA TTCCAGTTTG ACCAGTACCT 
AGAACTGCAT ATCTTGCTGT ACCATAAGTT 
TTAAATGTGT TTTTACTAAT AAGATTTGTC 
ACAACCTTAT TAGTACCACC TTTTAAAATT 
CCTTCATTTG CAATCATAGC AACCATATTA 
CCTTGACCTA TTGAAAAATT TACAGTATCT 



AAATTCCAGA 
ATTTAAAAAA 
CTTCGGCAAG 
TATAAGGTCC 
CCTGATTTAA 
ACCGTTTAAT 
CTAATAAAAT 
ATTTAGAAAT 
TTACTCTAAA 
TATATTCATA 
ATGCTAAAAC 
AATATACTAT 
TTAATAAAAA 
ATTTTGAAAG 
TCCATATGCC 
TTAAAGCCAA 
CATATTTGCC 
TATATTCATT 
TACCAACAAA 
AAACCATTAA 
TGATTTGCAA 
TTACTTTTTG 
GGGGCAAGTC 
GTTTTTCCTC 
ATAACACTTC 
TTTCTTAATA 
TTATTTACAA 
ACAATCTGCA 
CCTCCTACCC 



TATTACTAAG 
AAGGAAAAAA 
AATAGAAAAA 
CTTTAAAAAT 
ATTCCAACCA 
CTGATAAGTC 
AGAACTTGCA 
GAAAAATCCT 
ATAAAAAGGA 
CCAAACCGGT 
ATAGTGCAAA 
TGCTGTACCA 
TGCAGTAATA 
GGTTAAAATA 
AATCCAAGAT 
TATTAATAAA 
CACTATAAAA 
CTTGGTTAAA
AGAAACTATA 
TCTACCTAAT 
AAATGCCTTG 
CCTCAACCAA 
CAATAAAAGA 
CAACCTCAAC 
TCATATATTT 
TTTCTGGTTT 
TTCTAGGTTT 
TAGGAGTAGC 
AAGGCTGATT 
AAAAGTTTTT
AATTCCTGTT 
AAGATACTTA 
TAAATTAACA 
AAAATATCCA 
AGCAGCAACA 
TAAAAAAGAG 
ATAATTGTGA 
CCTTTTGCCT 
ATCATTACCG 
ATCTACCTTG 
AACGCCCAAC 
AAGTTCTCTT 
TAAATAGTTA 
AGAAATTTTG 
ACCAGGCTCT 
AAAATCTATC 
CTTTAAGATA 
ACCTCTTGAG 
AAACAAATGC 
TATTAAAAAA 
TCTTTAAAAG 
GACATTATAT 
ACAAAATCAT 
ATTGTCATGC 
ATTATCGTAT 
AGACCATATA 
ACTAGAAAAA
AAAATTTTGC 
CAATCTCTTT 
TCTTTCCACT WUGACTAGG
TTTTCTCCAA ACCCAAATTC 
AGCCCAAGTG TATAAAAATA 
TACCCATGAC CTCCGGGCTT 
GGACAATAAA TTTTACGATC 
ACTAATTTAA AAATAGACGC 
TAATCTTCCT TATTATCTTT 
AGAGCAAGAA CAGCACCTGT 
AAAGCATTCT TAGCAAGATC 
GGCACCATAT TTTTTATAAT 
TATTTTATTA ATCCCTCTTG 
TTTCCAATCG TAGAAGTATT 
TGATTTATTT GCCCAACATA 
CGCTTAAAAG AATAGGTCCA
AAAAGCATCT TTGGGGTAAG 
TGAAGTTTAG ACAAAATAAT 
ATCTCAATTC TAGTAGCAGC 
AACATAGTTA AATTATTTGC 
GCATTGATTT TTTCCAATCT 
TTACCAATTT GCATTTGGAA 
AATATGCCGA ACTTGTATCT 
AATAAAAATT TCTAGTAAAA 
TTACAAAAAG ATCAAGGTTG 
AAAACACAAT AGCTAAAAAC 
TAAGCATATT TTTGGGCATG 
ACCCAAAAAC AAAAAATCCA 
AAATGCTAGA TAATAATCCC 
TTAAAAAAAT ATCTATTGAA 
CTAAAAATGC GCTGGAAATA 
GTTGTTTTTA ACAAGAAAAA 




AAGAAGGCCA GCTACTTCAT^WGGCAAATC 154980

TTTTGCATAT TTTCTAATTC TATCAACTCC 155040 

AACATTAGAA GAATGTGCAA TCGCCTCTTC 155100 

CCAGCAATGA AAAATTCTAT TTCCAACTTT 155160 
TTTGTCTATA ACTCTTTCTT CAAGAATGGC 155220

AGGCGGGTAA ACAGATTGAA TTGCTTTATT 155280

ATTGTAAACA TCTTTCATAG AATAATAAGG 155340 

TGATGGTTTT AATACTACAA CAGAACCATA 155400 

TTGAATATCT TTATTGATAT TAAGCACAAC 155460 

AGAACCATCG TCTATTCTTC TCTCCTTAGA 155520

CCCTCTAATG TAATTATCAT AAACTTGTTC 155580 

ATCATACCCA CTAACATTGT AAAACGTCCT 155640 

ACCGATTGAA TGAGAATATG AATCGTCAAC 155700

CAAAAGAGCA GGATAATAAA ACTTTTTTTC 155760

TTCAATTATT TCAACATCTT TAAGATATCC 155820 

TGATTTATCA ATATCTAGAG TGCTTGATAA 155880 

AGGCATATTG TAATACTGTT GTAAGCTTAT 155940

CAAAACATTG GAATTAGAAT CCAAAATTTC 156000 

TGATAAAAAA ACATTGGCTT CTCTGTCATA 156060

TAAAATCGCC AAATAAAGCA CCATAATTAC 156120

AAAATTTGTT ATAACACCCA CTAATAATCC 156180

TAATTTTGAA TTGGATATAA AAAGTTAATA 156240 

AAAATTGAAT AATTAAAAGA TTTTAAGTCT 156300 

CATAATATAA TTTTTGAAAG AATAAAAAAT 156360

AATAATTTTA TTTTATTGTT AAAATAAAAT 156420

AGTGGTAATC CTGTAAAATA ATCCATAAGA 156480 

ACATTAAAAA TAAAATTCAA AGAATTAAAA 156540 

AAATAAAAAT AAGTTGCAAA ATAGTGTTGA 156600 

AAATATGTAA AAAATGTTGC CATTATTCAC 156660

CATACTCAAG CTTATCTAAA ACTATAGCTG 156720 
GCTCTACTTC TATTTTTAAA AGAGAATTAT
CAATATAAAT ACCAACTGGA TATTCACTAA 
TTAAATCTTT TTCAGCAAGT CTATTAACGT 
TGCCTTCTAT AAGGCCTATA AACCTACTAC 
AATTAGTTAA AGGCAAAATT TTAGCAGTAT 
GGCCACTAAA TCCATCCTGA TATGCAACTG 
ATCCTTTATT AATAGCCATT AAAGTCGATA 
CCGAAATAAA ATCGCTAGAG CTTGACGAAT 
TCTCTTGCCT TAGTGACTGT ATATTCTGAG 
TATAAAATTC TATCTTGTCC TTGTAATTTT 
AAATAAAACT AAAAACCCCA TGCATTCTGC 
AAAAATTATC TGATCTTCTC TTTTGAATGC 
AAACTATCAA TACCAAAAGT ACTTTGATAA 
CTTATTCATT GATAAAACTG TAAATATTTT 
AAAATAACCC GGCACCAACA GCTACCGAGA 
CTCCAGTCTC TTTTGAAAGA AGTCTATTTA 
AAATAATGCC ACGCTCAACA ATGTCTGTAG 
GCTTAACTTC ATCCACAACA ACATTTATAG 
AATCAACAAG TTGCTTTCTA GGAAGACCAG 
TTTCTACCCT TAAATTTTGA ATATCGGGAT 
CTGCTGTCTG TTGACCAATT ATAATATTAT 
CATCAAATTC GTCACCACCA GTCCTAATTG 
TAACAGATAT TTCTGTAGTT CCACCCCCAA
AAATAGGAAT ATCAGATCCA ATAGCAGCTG 
TTGCACCGGC ATTCATTGCG CTCTCTTTTA 
TTGGAACACC TATTACCATT CTCGGCTTAA 
TAATAAAATA TTTGATCATC TTCTCTGTAT 
GTGGGCGTAC GGCTTTAATA TTTTCTGGAG 
CAACCGCAAC AACTTTATTA CCTTTGGTTA 
AATCAAGAAT ATGAAAATTT GTAATCTTTC 156780

ATCCAGCAGT AACAATAGAA TCCCCTATTT 156840 

AATTCATTTC AAGTTTTTTA CCATAACCAT 156900 

TTTGAATCCT TGCGGACACA AAATTTTCAT 156960 

TAGAATAAAG CTTTACAACT TTGCCTACAA 157020 

CTATCATATC TTTTTCTATC CCATCATTGA 157080 

TGTTTGAATA GTTTAGATAT ATAATCTCTG 157140 

AAAAATTTAA TTGCTCTTTA AGACGAACAT 157200 

TGACTATTTC AAGCTGTTGT ATCCTTTTTT 157260

TGTATTCATT TACAGTTTTA AAAACATTGG 157320

TTTGAATATA AGAATTAAGA GTAAAAAACA 157380 

TGCTTGAATC ATGAATCATA AAAACAAGAG 157440 

AATTCTTGAA TTTGACAAGA AAATTCATAA 157500 

TACTAATATC TATTCTATTG GCATAATCAT 157560 

GAAGCGGATT GTCTGCAACA TAAACAGGAA 157620

AACCCTTAAG AAGAGCCCCT CCCCCTGTCA 157680 

CAAGCTCTGG GGGAGTTGCA CCAAGAGTGC 157740 

GTTCTTGCAA AGACTCTCTT ACTTCCATAG 157800 

TTACAGCATC TGTACCCTTA ATGTCTATTT 157860 

ATACATTTCC TATCTTAATT TTCAATTTTT 157920 

GAGAATTTCT CATATACTTT ATTATGCTCT 157980 

CTCTACTTAC AACCATGCCG CCAAGAGAAA 158040

TATCACACAC CATATGACCT GTAGGTTCAA 158100 

CAAGAGATTC TTCTATTACT TTAACTTCTC 158160 

CAGCTCTTCG CTCAACCTCT GTAATACAAG 158220

AAAATAATTT TTTACGAGAA AAAATTTGAT 158280 

TCTCAATGTC AGCAATAACT CCATCTCTAA 158340 

TTTTCCAAAG CATTTTTTTA GCATTTCTAC 158400 

TATCTATTGC AACAACAGAA GGCTCGCTCA 158460
TAACCACGCC ATAATCTTTA ^ffATAAACCA ATGTATTACA TGTTCCAAGA^WkATGCCAA
TATCTATCAA AAAAGACTTA AACAAATTCA AAACAACCTC CCTAAAAGTC TTCCAAAGTA 
ATTCCAAGCC TCTCTCTTGC TGCTTCTAAA TAAGGATTAA GATTGAGAGC TTCTCTCCAG 
TATTTTCTAG CTTTAGGATA ATCTTTATCT TTCTTTCTAT ATATATCTCC TATTTTAACA 158700 
TAAACGCTAG AATTTGAACT ATTAATTTCT AGAACTTTAT TATAGTAATT AAAAGCACTG 158760 
TCATAATCGC CTTTATCTAA TAGAATGTCT CCATAAAGCA AATATACCTT TTGAATTAAA 
TTTTCATCCG TTTTCTCACC CTTTGATAAT TTTCTTTTCT CTTCTTCTAT TATTTTTTTT 
ATATACTCAA TGCTTTTGTT TATATCATTC AACTTATAAT TAACATAAGC CAAACTCCAA 
AGCACTAAAT CAGATTTATT TTCATTAAAA GCTTTTTCAA AAAACTTCAA ACTAGATTTG 
TAATCTCTTA AAAGCTGATA CGAATATCCC AAATATTCAA AAATATCTTC TCTAATGTTC 
ATGAAATCAA AATTATCGGC ATTTAAGGCC TTCTTTAAAA ATTTTACAGC AAGCTCGCTA
TAAAACTCTC CTTTATGAGA ATATGCCTTT CCCAATATGT AATACAAAGG GCTTATGGAG 
ACTCCATCAT TTATAGAAAT TAAAAATCTT AGTCTTTCTA TGGATTTATC TAAAAACTCT 
CCTTTTAAAT ACCCTTCATT TACTATTAAA GAATAATAAA AATATGAAAA TCCTAAAAGT
AAATTCAAAT TAAAATCAAA TCTATGATTT TTAATGTCAT TCTCAGCATA ATCTATTATT 
TCTTTATATT CTTTTTTATC CCAGAGTAAA AGCAAATCAA CTTCTGTTGG ACCTGCCTTT 
AAATAAGAAC TAGAAAAAGA TTTAAAATAT GATAAAATGT AATAAATTAA AAAAATGAAA 
ATGAAAATCA TAAATGAGTA AAAAATATAT CTTAAATATC TTATTTCCAT TTAAAACTCT 
CTTTTAGGTC AATGAAATTA AGCTGACACC CATAAGCCGA GTTCTGTACT ATGCCATCAT 
CTCTCTTATT TTTTTATCGC TAAAAAAATC GTGCGATCTA CCCGTAAGCA TGTCCTCAAG 
AACAAAGGGT GCTTACATAC TTGATCTTGC TCCTAATGAG GTTTATCTTG CCTGTATTTA 
TTGCTAAATA AGCGGTGAGC TCTTACCTCA CCTTTTCACC CTTACCTTTT ACGGCGGTAA 
TTTTCTGCGA CACTTTCTTA GGTTTAAAAC CCCTAGGCAT TACCTAGCAT TATGTTCTTA 159840 
TTGGAGCTCG GACTTTCCTC TTAAGCTTTA ATTATAAAAC TAAGCGATGG CTGACTGCCA 159900 
GCTTAAATAA AAAGTATCAA ATTAATAAAT TTTATTCAAC AAGATCTGCT AAACTACCAG 
GAATTATTTC ATCGCTTATC TGAACTTTAT CCTGATAATA AAGTATTCTG CTGCAATAAG
GACAAAATTT AATATCGTTG GGCTCACGTC TTACTTTATT TGCAAATTCA ATAGGAAGTA 
TCATATGACA ACCTTTGCAA ACATTGTTAA CCAAAGGCAC AACTCCATTT GATTTATTTC 
TTATTATTCT TTGAAATTTA AATAAAAAAT CTTCATTCAT TTTAGAAGCA CAATTTAACT 
CTTCACTCTC TATTTCTAAA AGTTTCTTTT CAATTTCTAA AAGCTCCAGC TCAAAACTAC 
TGCTTTCAGC TCTAAAACAT TCCTCTTCTT TGATGTGCTT CTCGTTGACA TCTAATATTT
CCTTTTCTAT TTTAGTTTTA AGCCCATTAA CATGTGTCAT CTTTTTTCTA ATTGTAACTT 
CATCGTCAAT AATAACCTGA AGTTCTTTTT CAAGAGCCTC ATATTCTCTT TGCGTTTTAA 
TGCTATCAAT TTTTTCTTCA GCCTTGCTCT TTCTTGAATT AATATCTTGA ATATCTAACT 
TTAAAGCAGA GTCTTCTTTT TGATACTCCT TAAACTTTTG TTGCAAATCA ACAAGAACTT 
TCGACAATTC TTCAATCTGA TTTTTTTTCG CCTCCAAATA CTTGGGAATA CTTTTTCGCC 
TTTCTTCAAG CTCAAACTTA GATTTATATA TAACTTCAAG TTTTTTTAAT GTATCAATAT 
TGTTTTCCAT CAATCCTCCT GTTCAATTTA AATCTTCAAG ATAATCTTTT AATTTTTGAG 
TTTTTTTGGG ATTTTTAAGC CGCCTTAATG CTTTAGATTC AATTTGCCTA ATTCTTTCTC
TTGTAACATT AAAATGAAGT CCAACCTCTT CAAGAGTTAA AGAATAGCCA TCTTCAAGTC 
CAAATCTCAT TTTTACAACT TCTTGTTCTC TTTCAGGAAG AGTTCCAAGA ATTGCTCTTA 
TTTGATCTTG CAAAACTACA AAAGATGTGT GATTTGCAGG ATTTTTTATT GCCTTATCCT 
CAATAAAATC GCTAAGAACA GAATCTTCCT CTTCTCCAAT TGGTGTTTCA AGAGAAACAG 
GTTCTCTTGA AACACTCTTT ACAGTTTTAA CCTTTTTAAG TTCCCATCCA AGCCTGTCTG
AAAGCTCTTC ATCTGTGGGA TCTTTGCCTA AAACTTGAAT TAAATATCTA GTTTCTCTAT 
TAAGCCTATT TATTTGCTCA ATCATGTGCA CAGGAACTCT AATTGTGCGA GCTTGATCAG 
AAATAGATCT TGTTATGGCT TGTCTAATCC ACCAAGTAGC ATAGGTTGAA AACTTAAAAC 
CTCTCTTATA TTCGAACTTT TCAACAGCCT TAATCAATCC AATATTGCCT TCTTGAACAA 
GATCAAAAAA ATGAAGACCT CTATTTGCAT ATTTTTTAGC AATGCTTACA ACAAGCCTTA 
AATTAGCCTT AATCAACTGA TCTTTAGCAT GCTGCATCAT TTGCTTCCCT TTAGCAATCT 
CTTCTGACAT GCTTATTATT TTATCAGTTG GATATTCATA ATACATCTCA ATTCTCTCAA 
GTTCTTTTTG GGCAAGCTGA GCCTCTGTAA TCTGCTCTTT AATAGCATCT TCTTTAAGCT 
TGAGAGATTT TTCTATCTCT ATTTTTTTTT CAGCAATAGT CAAATCTCTT CCAAGCACCC 
TCAAATCTCT TATTTTTTCA ATTTTCAGCC TGCTTAGAAT TATTCTTTGT TGTCTTTGTA 
AATCTTTTAT TTTGTTAGCA GAGTCAATAT AATCATCTGA GAAAATCCTT AATTCCTCTT 
GATACAAAGG AATGTCTCTC AAAAGCTCTT TTAGGGCCAA TCTTTCCTTT TTTAAATTCT 
TTTCAAAAAT ATCCCCCCCA AGATCATACA CCCTATGCTT ATTATCTACA TAACTTATTA 
AACGATCTTG AATTGGCTTT AAAGGAATTT TGTAAAAAGA GGCAATTCTT TTTTTTTTAT 
TATAATAATC CGGACTACTC TCTTTATCCT TATCTTTTTC TCTTTTAAAA AACTCTTCTC 
TTTCCATTCT
CATTCTTAAG 
CTTCTTGATT 
GATCTTCTGA 
CCTTAACAGA 
CCTCATCACT 
CCTCTTCCTC 
TAACCAACCT 
CTAATATATC 
CCAAATGAGT 
CCTAACTCCC 
ACATTAACTC 
TGCAAAGCAA 
GCATCTCATC 
TTTATAGGTA 
AAATAAATTT 
AATAATATTG 
ATACCTCTCA 
AAATCTTTCT 
ATTGCTTAGC 
ATTTAAAAAT 
AGAATAAACA 
GGAGTCTACA 
AACACTAACA 
TCCAGCATCA 
TAAATGCTCT 
AAAAAAAGCA 
CTCCTCAAAA 
ACCTAAATTA 
GTTTCCTTTA 
TGAGTAAATA^f^TTCACAA GATTATAATA
AATATTCTCA ATTATACTCT CTCCAGAATC 
TCCCGTTAAT AAAAACTCTT TTCCTATTTC 
GTGACTATCT TTTAAAACAT TGCCTTTAAT 
AATATCTTCT TCATCACAAT CATCCAGCTT 
TTGAAAACCA TCATCTAAAA TCATAAAATT
TTCATCATTT CCATCTTCAC TGACAACCAG 
TATTCCCCTA TCCTCAAGTA CCGAACAAAT 
ATCGGGAAGC AAATTTGATA ATTCACTAAA 
AATAATACCC TCTATCAGCT TCGAATATTT 
TGGAACATCA TCTATGTAGA TTTTTAAATT 
ATTTATTTGA ATCTTAGCAT TTACCAAAGA 
AACACGAGAA TCTAATTTTC TTCTCTTGAT
ATCCACTTCA AATTCAGAAT TTAAAATTTC 
TCCTTTAAAT TTTTTTTTAA ATCCATTAAT
TCAAAGCACA TAAAAACTTT TCTGGCATCG 
CGCCTTACTA TGCTAAAATA ACTAAAATTT
TAAGAATCAT CATTATGAGC ATACAAATTT 
TTTATTCTGT AATAATCTTT CAATAAAGTT 
TTGTCTAAAA AAATTTTTTT CTGAGTATCT 
AAATTAATCA TGGCATTTAA ATCTACAGTT 
TCCAAAAGAT ATTCAAAAGC ATCACATCTA 
CCCTCACTTT TAAGAACATC TGCAGGATCA 
TTGATATTAA ACGGCAAACA AATTTGATAA 
TCCCCATCAA AAGAAAATAT TATCTCATCA 
TTTGAAAAAG CAGTGCCAAG AGTAGATACG 
AGAACATCTA TATACCCTTC TACCAATATA 
CCCTCATAAA ATCCATAAAG AAGCTCCCTT 
ATATACTTAG AACCTTTCCC ATCTAAATCT 
AAGTCTTTAA TTGGAAAAAT TAATCTTTGA 
ATTTTCTATA^IEAAGTCCCT 162060

CATTTGCTTT GCAAGTTCAA 162120 

CTTTAAATAA AGCTTGATTG 162180 

ATACCCTGAA CCTAAATCAT 162240

AACATCAATG TCAATATCTT 162300

TCTATCAGAT TCAATCTCAA 162360

ATCTAATTCC GAAATTTTAT 162420 

ACAATCAAGA ATCTCTGGTT 162480 

ACTAAGAGAT TTTCTATCTC 162540

CTTTTCCAAA TCCGACAAAA 162600

TTTTCTCTGC ATATTTAAAA 162660

GTCCCCATCA TATCTTTTTT 162720 

TGCAAGTAAA ATATGAATGA 162780 

TTCAAAAAAA ATTCACTAAC 162840 

GAAAAATCTT TATTATTTTC 162900

ACATTAATTA AATCACTATC 162960 

TTCAACAAAG CTACTATTAG 163020 

CTTTTATTAT TGTCAACTAC 163080 

GTCACACCAA TACCAAGTTT 163140 

ACTTTTGATA AATTTATCAA 163200 

TTATTTAAAT TATATTTATT 163260 

TTATTTAAAA TTTTTTGCAA 163320 

GTACCAAAAT CCATTCGAAC 163380

GCTTTTAAAG TTGCAGAAAG 163440

GCATATCTTT GAATTAAAGC 163500

GCTCTCTTAA TCCCAGATGT 163560

ACTGATTTTG TAGATTTAAT 163620

TTTTTAAAAA CTTCAGTTTC 163680 

CGACCTCCAA AACCAACAAC 163740 

AATAAAATAG AAACTTTGGG 163800
ATTGGTTTTC GAGAACAAAC CACTTTTTCT AAGTACTTCA
TAAAAAATCA TGAAGCTCTA AACCATTTTT AAAGTTAAAT 
TAAATCAACA ATTTCCTTAG ATATTGCTCT ACTCTTTAAA 
GTTTTTACTT AAAAAAAATT TAATGGTATT AATTAACCGA 
TGAAACCATG TCTTTATTTT CATTTTTATT TTCACTTCCT 
ATAATGAATA CCGGATTTTT CGCATAAAAT CTTAAGAGCA 
CATATCCATT AAAAATCCAA TAACATCTCC ACCCTTTTTG 
TCCTTGCAAA GGATTTACAA AAAAAGAGGG AGTCTTCTCA 
TTTGTAAGCA GATCCCGATT TAACAAGCTT AATATATTGC 
AAATTTGCTT TTCATTGAAG CTACAGTTTG TAAATACTTC 
ATAAAATTTT TAATATAATC TTTTGCCCCT AAAAGATGTG 
TGATGTGTGC CCAACTTAGA GTCTTTTACA ACAAAAAATA 
AAAAAAGCTG CTTGCAGCGA AATAATACCA GCATTTGAAA 
TTATTAATAT ATGTATTATA AGGAGAATCT ATCTCTAAAT 
GGATGGCTTC GTCCTAGCTC CTCTGTAATA ACATATTCAA 
GCCATACCAG ATTTTATTCT ATTATAAAAA ACCGAAGACA 
ACCCTATATT CACGTTCAAC AATAGATGCT ATTATTACCC 
GAATAATCGC TAAGAACAAC GCCTATAGAC TTAAGCTTAT 
ATGCGAACTA CATTCTTTAT TTCTATACCC TTATAAAATT 
AATCCTTCAA GAGAGTCATA ATCAAGCCCA AGCTCATAAA 
AAAAAAAGAA AATCTTGAAC ATCATCAATA ACAGAAAATT 
CTTCTGCTAG TATACCCTTC GGGTATTGTA ACATCAATAT 
AAAAACTCTT TATATATTTC AAATGTAGAA AGATCGCCAT 
TTAAATTGTT TATCACTACC TAAAATATAT GAAATAAAAA 
ATTAATTTTT GTTTTTTCAA TTCTTTAGCT ATTTTTTTAA 
TTAAATTCAT AAACTAAGCC ATTTGCCAAA GAAGATAAAT 
GACAAAATAG ATCCCAAAAA GAAAAAAAGA ATAAACACTT 
AGAATCTCCT AGCTAAACAA AAGTCTTTGC TAATTACACT 
TTAATTGATA GCTAACATCA TATTTTATAA CCAATAATAA 




GAAGAGTATC CTTTTGAAAC 163860 

GGCAAATAAC CAAGTTCAAA 163920 

ACATAATCTA AAGCTTTTTT 163980 

GAATTCAAAG AGTAAATTTT 164040 

CGACTTATTT TTAAATCATC 164100 

TCATTGTAAT TGATTTTTTC 164160 

CATCCAAAAC AATAAAAATA 164220 

GCATGAAAAG GACAAAGACC 164280 

TCCACAATAG CTACAATATC 164340 

ATACTTCTTA ATCCTTAGTA 164400 

AAGAATATTC TGATGAAAAT 164460 

AATATTGCGT ATTTTTTGGG 164520 

TTGGAGTAGG AGGATATCCT 164580

CTGAAAAATA AATTCTCTTA 164640

TAGTAGCACA GGATTGTAAT 164700 

TTATTGGGGC TTCACTTTTA 164760

TATTGTAAAG CTCCTTACTT 164820

TCAAAAAATT ATCAACAAAC 164880 

TATAAGTATC TGGAAATAAA 164940 

TAAATGATTT TTTGTTGATT 165000

CCTTAAGCTT TAAAGCAATT 165060 

TTACGTTAGA AGATCCCTTT 165120

TTATTAAATA TTTCCCCTCT 165180 

CAAGAAGCAG CTCGGATTTA 165240 

CTCCCCAACC TTTTTCAATA 165300 

TTAAAAAATA TATAAAAATT 165360 

TCCCAATTTT AATAAGCACA 165420 

TTAGCTATCA ATTAAAATTT 165480 

CTTTTCTAAA TAGACAAAAA 165540 
TAAAAGCACC AAAATGAAAC^TCATAACAC ATTTAAAAGA
AGGCAAACGC TTATAATTTC ACTAAATTGG GGAGGGTGGG 
AGCCAACAGA TTTACAGTCT GCCCCGGTTA ACCACTTCGG 
CTGTCGGATT CGAACCGACG ACTGCGGTTT ACAAGACCCG 
AAGTCGGCAA ACAAAAACTA ATAGTTAGTT TATAAAAATA 
TAATGATAAA TAATTAATAT AATCTTCCTA TTAATCTATT
TTAAAAACAA GAGCAATAAA AAAAATTATA AAACTAAAAA 
ATTTTTTTAT TTTTCAAAAA ATAAAAAAAT ATCAGTCTAG 
ACAGAGCTTA AAATAATAGC AAGATCAAGA TAAAATTTCT 
TTGTAAGAAC CAGCCAAAAA CACGAACAAA CTTAACAAAG 
ACCAAAATTC TAATAAAATT TGAAAAACGT CTTAGATTTT 
AAATTTATAA ATTCTTTATT ATAAAAAGTG CTTGATTTAT 
TAACATCGGC TAATGTAATT GTTTTATTTT CAGTTACCAT 
AAGATTTTCG AATAAAAACA TTATGATTTT CCAACAAAGA 
CAATAAAATA ATCCACAAAT CTACTTGAAA CCTTATCAAT 
ACCACTTCAA AATCCTTTCC TTATAAGAGT TATCAATTTT 
CAAACTCTTC ATTTTTATAA TCAATCTCAA AGTAAATATT 
CAGTAATATT ATTTTCAATG TCAACATATA TATTTCTCAA 
CGACATTATT TATAGTTCTT AAAACTTCCA TCATTCTAAC 
AATCATTAAG AAAATATAAC TTTTTAAAAC TATAACGCCC 
TCTCGCCTTT TAAATTTAAA ATTAAACTAG GATCTCTAAC 
TAACAAGAAC TTTTAAAGAG CTATCACGCT TTTTAATAGC 
ATAAAATGTT CTCATAAAAA TTTTTTACAT CAAAAAAGGG 
CAACAGGATT TTTGCAATAT ACTCTAAAAA TATCTAAATA 
CAACAATCTT AGATACAATA GATAAAACAT TTTTAAATAC 
GATCTTCTCT TGAAATTATC TCTGAATTAC TAATATCATC 
ATATAATGCT TTTAACAATA TCCTCATCTA TCTCAATATC 
CAAAAAAAGA ATTCAAATAC TTACTAACAC TTGAAAAGCT 
AGTCAACAAT GCTAATATTC TCAGCACTAT CTAAAAGATC 
AAGATTTGTA AGGGAAAAAC GCCAAACTAT TAAAAACATA 
TCACAATAAA^KtTATAAAA 165600

ATTCGAACCC ACGTAGGCAA 165660 

TACCGCCCCA TAAAAGCCGA 165720

CTGCTCTGGC CAGCTGAGCT 165780 

AAATTCATTA GTCAATAGAA 165840 

TATTACCAGA AAGAAAAATT 165900 

TTAAAATTAA ATAATACTTA 165960 

TAGATTCAAG TCCAAAAGAA 166020

GAAAATTGTA AACAAAAATA 166080

AATTTAAAAA CAATACAATA 166140 

TTACTTTAAA AGCTTTCATG 166200

GATTTCCAAA AGATTGCCAA 166260 

TGATTTGTAA GAAACCGTTA 166320

CAAGCACCTC TTTTTAAGCT 166380 

AGAAACGGAT TCAGTTAAGC 166440 

ACCGTCTCTT TTATACCTTT 166500 

ATAAATTTCT TTTTTCAAAG 166560

ATCATGATTA TTGATTAATA 166620

ATCATATATG TTTTTAAAAA 166680

AAGATTTAAT TCATTAAAAA 166740

AAGATTTTTT ATTTCTTTGA 166800 

CTGACCCATA ACATTTAAAA 166860 

TACATATTTT TTAGGAGATG 166920

AGGCAAAGTC CTCGACAAGC 166980

ATATTCTACC CGAGATATAT 167040 

AGATATTGAA AAATAATTCG 167100 

TTTTAATGTA TACAAAATGT 167160 

GGTTCCAAAT CCAACCTCAA 167220 

TACATTAAAA AATGAAAAAA 167280 

ATAAAACTCA AACATATCCG 167340 
AGACGATTTT
GAGATTCTTC 
CATCTAAAAA 
AAATAGTTTG 
CCATTGAATA 
TCTTAAAATC 
TAACATAAGC 
TTAACCACAT 
GCTGAGAATC 
AACCTAAGCT 
CTATAGAGTT 
AAAATAAAGT 
AAAAAATTAT 
TAATTAAAAT 
CTTTAAGAGA 
CTCCTGAAGA 
CAATTGGTAT 
AATATGAAAA 
TTGAAAGAAG 
TAGATTGCTA 
CAAACAAAAG 
CACTTAATTT 
TTTTAATGAT 
ATATACAAAA 
TAGAAATAAA 
TAAAGGCAAT 
CTTTACTTAA 
ATAATAACAA 
GCACAAGAGA 
GTAAGTACTA GATGGAATAG AATTAATATA
TAAATCGTTT TGGGTTTTTT CTTTTTTTAA 
ATCTTCCAAA TTGTTCTTAG AGTTAAGAAC 
CTCAACAACA ATACCACCTT TTTCTAATTT 
TCGGCAAACA CTAAAAAACA TTTCCACAAA 
TATTACTGGA TTTTTACACA TTTCATTAAT 
TTTATATATA TCTTCTTTGC TTTTATCTTT 
TCTAATAAGA AATGATTCTT CCATTAAGCG 
AACATAAGGC TCATTACTAC TATTGGGCAA 
TTTTCTTATA TCATTCATCA ACCTATCTTT 
GTATACGCTA TTGTTATTAC CACTCATAAA 
ACTACTTACA ACATACTTCA AATAAATTAA 
TTACACTATA ATATAATTTT ATATGAATCT 
TGGTAAAATA TTTAAAAAAA ATAACTACGA 
CTTACTGCTT AATAAACAGC CTTACGATTT 
AATAATAACA TTATTTCCAA ATAACATCAA 
TATTTTTAAT AAAAAAATCT TTGAAATCAC 
CAACAGAGCC CCCAAACAAG TAGAATATAC 
AGATTTTACA ATTAATGCAA TTGCAATGGA 
TAATGGGAAA AAAGACCTTA ATAAGAAAAT 
ACTTGAAGAA GACGCCCTTA GAATACTTAG 
TAACATTGAA AAAAATACTT TAATTTCAAT 
TTCAAAAGAA AGAATAAAAA ATGAATTTCA 
AGGAATTTAT TATCTTAAAA AAGTTGATTT 
AACAAAAGTA ATCAAAAAAA TTGCTCTACT 
CACAATATTG ACAATTAAAA AACCTATAAA 
ATTCTCAAAT AAAGAAATTA AGCTGATTTT 
TATTTTTAAT GTCAAAAAAT TAAGTGATAT 
ACATTATAAA GAAATAATTG ATATATACAA 
TTCATTTATC TTGACCTTAA 167400

AAAAAGCTCA TATTCACCTT 167460

TTTGCTTTGA ATAATTTCTA 167520

CCTAAAAAAT TCTTTCAATT 167580

ACCAACATGC AAACATTCTC 167640

TTGATTTTCC AGGTTTTTTA 167700 

CTGAAAAAAA TTTAAAATAG 167760

CCCCGAAAGA AATTTTTCAA 167820

AACAAAATCG GAAGAATTCA 167880 

AGTCTCTTTT GAAAGTTGAC 167940

ACCCCAATAA AACCATTTAA 168000 

AATTGTAAAT CAATTTTAGA 168060 

AGGGAAAAAC AATCCAAATA 168120 

ATTTTATTTA GTTGGAGGCG 168180 

TGATTTTGCA ACAAATGCAA 168240 

AACAGGAATA AAACATGGCA 168300 

CACATACAGA ATAGAAAAAG 168360

TAAAAATTTA CTTAAAGATC 168420 

TATTTTCAAC TTCAACATAA 168480

AATAAGATGC ATAGGAAATC 168540 

AGCAGCAAGA TTTTCATCCA 168600
GAAATATAAA AAAGAAAATA 168660

CAAATTGTTA GAAGGCATAA 168720 

TTTTAAAAAT TTTTTTAATC 168780 

TGATAAAAAC AAATTTTATC 168840 

AGAACTAAAA GAAAAATTAA 168900 

ATTTTATAGA GGCATAATCG 168960 

TAGATATTTG CTTAGCAAAA 169020 

AGCACTCAAA GGAAAAAATA 169080 






AAAGATATTT 



ATTTATAATA 




tCATAA 
AAAGAAAAAA



ATTGCTAAAA 1 




ICTCTCT 



169140 



CTTTAAAAGA TTTAAAAATA AACGGAAAAG ATATTCAAAA TCTAGAACAA ATAGAAAACA 169200 

AAAATATAGG TAAAATTTTA AATATGCTAC TAAGATGTGT AATTGAAAAT CCCAAGCTTA 169260 

ATACTAAAAA TTATCTTATA AAAAAAATCA AAACCTTAAA GGTTAATGTT TTCCATAGCT 169320

TTTAAAGCTA CTTCAGCCGC TCTCATTTCG GCTTCTTTTT TAGATTTGCC CTTTCCATTT 169380 

GATATAAAAT TTTCTCCAAC ATAAAGTTCC ACACAAAAAA CTTTATCATG GTCTGGACCT 169440 

ATTTCCTTGT CTAGCTTATA ACTTGGCGAG ATTTTATATT TCTTTTGAAC ATATTCTTGC 169500 

AACAAACTCT TATAATCTTT AAAATCCCCC CTATTAAACA TCAATCTTAT ATACATATCA 169560 

AAAAGTCCAA CCACAAATTC TGTTGCTCTT GAAAACCCAC TATCAAGATA AATAGCGCCT 169620

ACAAAAGCTT CAATAGCATC TGCAAGAATG CCTTTTTTAT TTCGACCATC ATTACTCTCC 169680

TCCCCTCTAC CTAGCAAAAT ATAAGAACCA AGATTAATCT CTCTAGCAAT ATTAGATAGG 169740 

GAATCTTCAC TAACAATATA AGATCTGGCC TTACTGAGCT CTCCTTCACT TTTATTTGGA 169800

TAAGTTTTAT AAAGATGATC TGTAATAATC AAATTAAGCA CAGAATCTCC CAAAAATTCT 169860 

AATCTCTCAT TATTACTAGA TTTTTGATCC AACTCATTAG AATACGACGA ATGACACAAT 169920 

GCTGTATTCA ATAAATCAAA ATTACTAAAG TCAATGCTCA AATTTTCCAA AAATTTACTC 169980 

AATTGAGATT TTCTTTCATT ACACAAACAA AAATCAGAAG ATTTTTTTTT CATCAACCCT 170040

TTCTCTTTTT AATAAAATTA ACAACATCGC CTACCGTCTC AAATTCATTG GCTTCATTCT 170100 

CTGGAATCTT ATCATCAAAG GCCTCTTCAA GCAAATACAA AAGCTCATAA ATATCTAGAC 170160 

TATCTGCATT AAGATCTTCA ACAAATCTAG AGTCTGTGGT AATTTCATCT TCTTTTTTAT 170220 
CAAGTTGCTC AGATATAATA GACCTAACCT TGCTAAAAAT TTCATCATTA TCCATGAATA 170280

CACCTTCCTT ACTACAAGCT ACAAACCTAT TTCTAGATAT TGGTTATTCC TATAATATCC 170340 

ACATTTTAAA CAAATCCTAT GTCTCACGCC AAGATTACCA CAATTAGAAC ATTCTTGAAA 170400 

TTGTGGAATT TTTTTTCTCA TATTTATACT CCGCCTTGTT CTACTTCTAG ATTTTGAAGG 170460 

CTTAAATTTT GGAACAGCCA TTACTTTCTC CTAACTACTT TTAAACTACA TTAAATTATC 170520 

TTAATAATAT ATATATTAAC CAAGTCATTT GTCAATAAAC TTAGATTTTA ATCTATTAAA 170580 

CACCAATTCT GGAACAAAAT TAGAAAGATC AACATCCTTT TTCAACATCA ATTCCTTTAC 170640 

AAAATCCGAC CTTACATATA AATGTTCTGC ACTACTTGGT AAAAATATAG TATCAATTTG 170700 

AAAATTTAAC TTATTATTAA CAA^ATATCT TTCAAACTCT ATATCAAAAT CATTAAAAGC 170760 

CCTAATTCCT CTAACAATAA ATTTAATAGA ATTAATTAAT GCATAATCAA CAATAAACCC 170820 

GCTATACCTA TCTACAAGCA CATTTGAAAA ATTTAAAGAC GAAATAACAT CTTTTGTAAG 170880 
GCTAAACCTC TCAATATCAC TTAGGAAATA TTTTTTTGAT TTATTTTTAG CTACTAAAAC
AATAACTTTG TCAAAAATAG CCAACGATCT TTTAATTAAA TCAATATGAC CCCAAGTAAT 
TGGATCAAAA GATCCTGGAA AAACTGCCAC CCTCATATCA AGAAAGCCTA AACCCCCTAT 
TAATATCTTT GAGAATTTTT TTACGCCATA GTGGTTGTTC TTCTAAATTT TCAAATTCAA 
TTTCTTTAAT AATTTTGAAA TATAAATATT CTCCTCTAGA AGATGTATCC ACAATAAAAA 
ATTTACCAGC ACCTTTATAA ACTACTTTTT CATTACTTGA AAGATTATTT TCAGATGTTA 
ATTTATTGAT AACACAATCA AAATGAACAG GTTTATCTTC ATTTTCAACC TTTAAAGACA 
TGCTATAAAC AATATCATTT ATTTTTTTTT CACAAATAGC ACAAGCAACA TCAAGATTTA 
TTCTTGTTTT AAATTTTATA CTTTTTCTTC TAAATTTACC TTTAAAAACA TTTTGTTTAT 
CTGTATTCAA GACAGTCGAT TTATTATTAA AACTGTTTAG CGGGGCAACA TTTTGACTTT 
TGCTCTCAGA ATTTAATCTT TTATTCTTAA AACCATAAAA ACGCTTATTG TGTTTCTGAA
AATTACTACT CTTGAACTTA CTTTGAGTAT ATTTCAAAAT TAACTCTCCT TAACTATAAA 
AAATAAATAT TTACTCTTCA AAAAAATAAA AACAAATATT TAAGAACAAA ATCACTCACA 
TAGCCTGAAA GTTTTTCAAA AAATCAAAAC TTTAAATCTT CAGCTTTTTA AAGACAAAAG 
CACAATTAAA CCCTCATTGA AATTTAATAA GAATGGTTAA TTAATGCTCT CAATTATAAT 
AAATACACAA CCAAGCTTGC CAAAATTATT TCCCAAAACA GCAAAAACTC TACTAAATCT 
TGAATAAAAA TTTACAAAAC AAAAGATCAA TCAAAAACAA TTTTTAACCT CAATCTAATT 
TGATTCATCT TATGACCTAA TTTGCTAAAA TATTTAAAAT AATTAAAAAC TAATGCTTAA 
CTTTTTTAAT AGTAAGGCGC TCCTTTATCT TCATAGCAGC CTTACCGATT CTATTCCTCA 
TATAATAAAG CTTTGCCCTT CTAACTTTTC CCCTTCTTAA AACTTCAACC TTTTCTATAA 
TAGGAGAATA TACTGGGAAA ATTTTTTCAA CACCTATTCC TGAAGAAATT TTTCTAATCA
AAAATGTTTT GCCAATTCCC TTGTTTTGGA AAGAAATAAC AATCCCTTCA AAACTCTGCA 
ACCTTTCATT ACTACCCTCA ATAATTTTGT AAACAACCCT CACAGTATCT CCCACATTAA
AAACAAAAGC CTCATTTTTC TTATTCTGAG CTTCAATTTT TCTTATCAAA TCCATTATCT 
TCTCCTATTA TCTCTAAATA TTTAAGGTAT AAATCATATC TATTTTTCTT AGTTTTTTCT
CTAGCTTTAA CAAGCCTCCA ATTCTTTATA TTTGCATGAT GTCCCGAAAG AAGAACTTCT 
GGGACCTTTA TCCCCTTAAA ATCATAGGGC CTGGTATAAT GAGGATATTC AAGCAATCCA 
TTTTTTACAC CAAA1>GATTC TTCTAATAAA GAATTGGGAT TTATTACTCC ATCTAGCAAC 
CTATATACAC TATCTATTAA AACAAGAGCT GCAATCTCTC CTGAAGATAA AACATAATCT 
CCAATAGAAA
CTTCCACAAA 
TACTTTATCC 
AGAGCAAAAG 
TCATCACATC 
ACTATTCCCT 
GCTGGAAAAA 
TTGTTTTTTC 
ATTTAATACC 
CTTCTCCTAG 
ATTCGCCTTC 
GCGGCCTAAC 
TTATATTAAC 
CTTTATTGTT 
CATACCCATT 
AAATTTCCAA 
TAGCCCGCGC 
TTAATTCCAA 
TATCTACAAG 
TTTAAACCTC 
CACTTAAAAT 
GGTTTTGCTT 
GAGACGTAGA 
TCAATCTTAT 
TATTTTGAAA 
TTATGAGCTT 
TTGAGGGATT 
TAGCTTCTTC 
AAACACCTGG 
ATTGACTCAG 
TCTCAAAATC^HCATACAAA TCTATAATAC
TTATAACAAT TTCTTCTCTT TTTGACAAGG
CAGAAGGACT TAAAAATATT GTCGTTTTCT 
AAATCGGTTC GGCCTTCAAT ACCATCCCAG 
TTTTATGCTT ATCTTTTGAA AAATCTCTAA 
TATTAATAGC TTTTTTCATT ATTGAATTTT 
GGGATAAAAC CGTAAATTTC ATTTTAAAAG 
TTGAGTATTT ATATCTCCAA TATATATACT 
CACTCTGACC TCAAGAAATA CACTATTTAA
TTTTTTATTA TTATTAACAA TGGCATAGCC 
TTTTAAACTC GATGCAAGCG AATCATCAAC 
TGCCTCTGGA GTATCAATCT CTTCAAACTT 
ATCTACAACT TTAACTTCAA CACTGGAACT 
TTTTAGATTA ATAAAATCAC AAAAATTATT 
AACTCCATAA GACGATAATA TAACGCCTTT 
TTGCACTCGC CTATTGGTTT TGGCAGCACA 
AATACGACCC CGTCTTCCGA TTATCTTGCC 
AATAGTTGAT TTTTCCCCTT CAATTACATT 
AGACTTTACT ATAAACTCTA TAAGTTCAAT 
CTGACTTTTC GCATTCAAGT TGTTTTTATT 
TGCTCCCTTG CTTATCCAAT CTTTCATTCT 
TTCAACAGGA TGATAATAAC CAAGTTCTTC 
ATTCATAACT ACAACCCTAT AATAAGGTCT 
CTTAACGCTC AAATTTATTC CTCCTTATTT 
ATCTTTATTT TTCATTTTTT TCATAATCAA 
ATTAACATCA AAAACAGTTG TTCCACTTCC 
ATTCAAAATC ACTGGATTTA TTCTTTCTTT 
TTTATTAAAA CTTTCTTCAT TTAAATTATT 
TAAAAAACTT ACAAAATTAG AAAACCCTCC 
ATAATCTTCA AAATTAAAAC TGGCTTTATT 
GTTGATCAAT^^tTTCATAT 172680

AATACGCCAA CTCTTGGCTA 172740 

TGGCAGACTC TACATGCTCA 172800 

CACCGCCTCC ATAAGGCAAA 172860 

CATCAACAAG CTCAAAACTT 172920

CAAAAAATGG CTTAATTATT 172980 

ATCTAGAACC TTAAGCTCAA 173040 

TAAAAAGGGA ATAAAGAAAA 173100 

ATATTCAAAG AAAGCTACAA 173160 

AATAAGCTTT CCTAAATAAT 173220

CCACAATTCA AAACCAATCA 173280

CAAAAATAAG GAATTACCCT 173340 

ATTGCTTTTT TTTAAAAGAA 173400 

GGATATGCTT TTAACCCTAG 173460 

AATAAACATA GATTAATCTA 173520 

AGCTCCAAGC AAAGTTCTAA 173580 

CACATCACTT TGAGAAACCC 173640 

TAACTTTACT TCATCTTCTT 173700 

CTCATTCCCA TACTCTTTCA 173760

TAAAAGCATT TTCACTGTAT 173820 

ATCTTCCTTA ATTTTTATTT 173880

AATTGCTCTA CCATCTCTAG 173940 

TTTTTTAGCT CCCATTCTTT 174000 

TCCCAAAAGG GATGCAATCT 174060 

AGTTGCTTGA CTAAACTTTT 174120

CATGGCTATT CTTTTTTTTC 174180 

TTTAGTCATA GAAAGAATAA 174240 

GCTATTCAGC ATTGATTTTG 174300 

TACTTGTCTA ATGCGCCTAA 174360

AATTTTTTCT TCAAGCTTAA 174420 
TAGCCTCTTC TTTGTCAACA
TGCCAAGAAT TCTAGAAGCA 
CTCCAACACC AATAAATTTA 
CCCCCCTAGT ATCTGAATCA
TAAATTCCTT AGCAATATTT 
TTTCTGCGGG TCGCAAAATC 
TTTCAAGTCG TCCTCTAGTA 
TCATAGACGC TTTAACAATT 
CACCTACTTG ACCACCCAAT 
CAGCTACAAG AAGTACTTTT 
TCGTGGTCTT GCCAGAACCT 
GATGTAAACT AAGCTCATAA 
TTTTAATAAA CTGAGATTTA 
CAATTATAGA ATTTAAAAAA 
AATTTTTAAT AATCTCAATA 
AAAGATAGTT TATAAAATTT 
CTTATTCTCA AAAATAATCT 
ATCTCCTCCA GTCAAATAAA 
AACTTCAGTA TCTAATGAAC 
CTCTTCTTTT AACAAATCAA 
ATAATTAAAA TCAAGCATTC 
ATCTTGCATA ATTTTATTTA 
TGTAACCTTT TTCAATCTGC 
AAATTCTTTG GAAATCAATC 
CTTATTACCA CAAGTTGGAC 
CATACCTGAT TTATTATTAA 
AGCAGTATCT GTGTAGTCAA 
AAGAGATAAA TTTTTAACAT 
TATTCTCTTA ACAGGAACAT 
ACACTTTGAA CCTTCTCTAC
ATTCTTTCTG GGTAAAAGGA 
ATGGGAACTG CACAAATACT 
AACTTAGAAA ATATTGCACC 
ACAGCAACTT GCCCCATCAT 
CCCTTTATTT TTTTTATCTC 
TCAACTATTA CAGAATCAAA 
TTAATAGGAT CTTTTTCCCC 
ATTTTTAACT GTTCTACGGC 
CTATTTTCCT TTTTAAGCTT 
TGAAGTCCCA ACATAAGAAT 
TTTTTGCCTC CCAAAAATTT 
GGATCAATGC CCCTTAAAAC 
CGCCTTATAA CTCTTAAGTT 
GCCTCTGCAA TGTTTTTATC 
CTAAAATTTG ACCCCAAACT 
TGTCAAATAT AAGATACAAA 
ATTTATTAAA AATAACACTG 
TAAGCTTTAA TACTATATTC 
AAAAAGCTTT TAATTTAAAA 
TTTGAATATT AATAATAATT 
TAAAATCATA TTTTTCATAA 
CCTCATACTT TTCATAAAGC 
TCTGCAAAGC AAAATTAGAT 
AATTTTTTTC TCCCTCATAA 
AACCAGGGTA AACATTGCCG 
AAAACATAAT ATTATCTATA
AACTTTCAAG ATAAACTGTT 
CTTTTTCAAT CCACGATCCA 
AAGACTAACA ACATCCCCCA 174480

ATCAAGATCT TCGATTTTCT 174540 

TTTAAAAGAT AATACAGCTC 174600 

GGTAAGTCCA ACATTCTCAT 174660 

AGAGTCTACT ACTAAAATGG 174720

TTCAACCAAC AAAGATTCAA 174780

AAAATTAGAT TCAGCAAACT 174840 

TTCAATTGAA AATACTGGAA 174900 

CGCCGCTCTA AAAGTATCAG 174960 

TAAAGAAAGC TTGGCGCATG 175020

ATAAGATTGC TTATTGGCAG 175080

AACAAGATTA TCATTGACAA 175140

TTTTACTCCC TTGGATTCTT 175200

AACATCAGCA TCAACTAAAG 175260

ATTTATCGTA GATTTTCCAG 175320 

TTCAAGCATT AAAAACAACC 175380

AAAAGGCAAT ATTAATCAAT 175440 

GAAGCGGGCC CAACAACAGA 175500 

TCTTTATTGA GCTTTTTAAT 175560 

CTTTGACCAT AGAGTACTAA 175620

ATTGCCAAAT ATTTAACAGT 175680 

AGAGAAAAAA TGTCATATAT 175740 

TCAGGAATCT CACCATTCAT 175800 

ATTAGCATAT TGACACAACC 175860 

TCAATTATCA TGTGACTAAC 175920 

CCTGACCAAA TCGAAAGTTC 175980

TTTTTACCCA TAAATTCAGC 176040

AGTGAAAAGT ATTCCTCAAG 176100

TAGCTATCAT TAACAATGCC 176160
TCCTTTATTA




ICTGTAAT 
ACTAAAGCCT



AAGCCAATAA 




'ATCTCT 



176220 



TGAGAAATTA TGTTTCCAAA TAATTTCTAT CATATGATCT TTTATTTTTT CTAAAATATC 176280

ATAAGCGCTA ACTGGGGGCT CAAAAGAATG AGTCTCGCTT ATTAAAACCT CGCATTTAAG 176340 

ATTGGCAATA CCTATTTGAA AATAATTGCT AGAAATAATA ACTCCCATTG AATACGCATA 176400

ATCTTTATTA ATATCGAGAA GTATTTCTTT TCGTCCATGT TTTTTAACAT CAGACACCCT 176460

AGAGCCAACT TCAATCAAAA GATTTTCTTT TATCATTTGA TTAGTCAAAA TAGTAACTGC 176520 

AGCATTTGTC AAGCTTAACT TACGAGCCAG GTCTGTTCTT GAATATTGCA TATTTTTCAA 176580 

ACTAAGAAGA ATTTTTCTTC TATTTCCGCC TCTAATTGAA ACCATATTTT CACCCTGCAT 176640

AATACACCTC CTTTATTTTT AAAATAAAAA AAATATTATA AAATATCATA TCAAAAAAAC 176700

CAATACAATA TTTTATCTAT TCTTAAAAGA CAAACATGCC TTTATAAGGC TAAAAAACAT 176760

TTTACATCAT AATATCACAT TCATAAAATA TCAAAAACTT AAAGCTTAAC AAAAAAGGGA 176820

ATAAAATCAT TTTTACATAA AAACTCATCA ATAAAATTAA TTGGATTTAA AAATAATAAA 176880

TACAAGAAAA GCCATTTTGC CTTAAAAACC ACTAACTTTA ACTTAATTTT TCCTTAAAAT 176940

AAGAAAATTC CATAGTAAAA CTGCCCCTTC CTTTAGTAGA ACTTCTCAAA ATAGAAGCAT 177000 

ACCCAAAAAG TTTCTCAAAC GCCGCCTCTG ATTTTATCAA GTCATACTCT CCAATATTGC 177060 

TAACTGAATG AATAACACCC CCCATAACAT TTAATGTAGA AATAATTTCG CCTGTATGCT 177120 

CAATGGGTGT TCTAATTTCT AATAACATTA TTGGCTCAAG TCTAATAGGA TCTGATTTTT 177180 

GAAAAATACT ATGAAAAGCA AATCCTGAAA TTGACTCAAA AGCACTCTCG CTAATCTTAT 177240 

TGGCCCCACA AACAATAGAA AAAATACTAA CATTAATATC AATAATGGGA TATCCAAAAA 177300

TTCCACTTAC AAATGCAGAT GTAATTCCCC TCAATATTGC AGACTTAATT ACAGGTTCAA 177360

TGCCACATTC AAAATCAATT TTATTTCCCG CACCCCGCGA CAAAGGTTTA ATGATCATTC 177420

CAATTTTAAA ATCAATATTT TTGCCAGCAA AAATATTGTT AAACTCAAAA ACTTCTTTTA 177480 

CAATTTTGCC AGCACTTTCT CTGTAACTTA CTTGAGGCTT TCCTGTATAA ACATTAAGAT 177540 

TAAATTCATC TTTAATTCTT GTTAAAATAA TCTCAAGATG TAATTCACCC ATTCCAGATA 177600

TAATTAATTG CCCTGTTTCT TTACTCTCAG AATAACTAAA GGTAGGATCT TCTTTAGATA 177660

TTATTTCAAA AATTTCCTTA AGCCTAACCT CATCTGATGA TCTTTCAGGC TCAACAGACA 177720

TTAAAACAAC CGGCTCTGGA AACATAACAG CCTCAAGTAA AACATTATTA TTCTCTTCAA 177780 

CAAGAGTATC TCCTGTAACA GAAAACTTTA ATCCCAAAAC AGCACCAATA TCGCCTGTTT 177840 

TTACAAAATC TATTTGTTCA TTTTTATTTG AAAAAACTCT AAAAATTTTT GTAAACTTTT 177900

CACGCTTACC ATTTGAAGCA TTGATAATTT TTTTATTAGG ATTAATCTCG CCAGAATAAA 177960 
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CTCTAACAAA ATAAAGATGA GAAGCAATCA CGCTTGAATA 
CTGACAATTT TTTATTTTCA TTAGGATCAA CTAAAATTTT 
AAGCGCTAAA ACTTTTTTCA AAAGGACTTG GCAAGTAATC 
GTTCTATTCC AATATTTTTT AAACTAGTTC CCATTAAAAC 
TAGTGCCTCT TCTAATCTCT CTTTTAATAA TATCTAAACC 
ATAATTGAGT AATTTCTTCA CTAAATTGGC TAAGAATGTC 
GAATCACTTT TTCAATAAAT TCTTCTCTAA TTTGACTATA 
TTTCCATTGA AAAATGAAGC TCTTTATTTA AAATAATATC 
TTTCATTTCC AATTGGAATT TGCAAAACCA AAGGAATAGT 
CTCCCACAAC TTTAAAAAAA TCAGCACCTA TTCTATCCAT 
GTGGGATTTC GTATTTTTCT GCCTGTTTCC ATACAGTTTC 
CAACAGCGCT AAAAATAACA ATACCCCCAT CAAGAACTCG 
CTGTAAAATC CACATGCCCA GGAGTATCAA TAATGTTTAT 
AAGTAATGGC AGCTGAACTA ATAGTAATTC CTCTTTCTTG
TAATAGTGTT TCCTGAATCT ACATCCCCCA TTTTATGACT 
TTCTTTCTGT GGTAGTAGTT TTTCCAGCGT CAATATGAGC
TACTCATAAA TCCCCAACAA CTACCACAGC CTCAATGCAG
TAAACCTTTT AATAAAATTG CTATTTTGCA TCACACTCGC 
TCGTGACAAA AACAATCGCA ATTTTCAAGC AATGCCTTAT 
AGATCACTCA TGATATCATC CATAATATTA GCAGTACCAT 
AATTTTCTCA TTCCAAAAAT ATTCTTCAAA ATCTCAGTAA
ATTGAGGGCA AAAAATTAGA AGTTGATTCA ATATCAAGCT 
ATAAACTCAG AATATCTAAA TTCAGAATCA TATCCAAGCA 
ATATCAATAA TTTTTTCAAT ATATTCATAA AGTTTTTGAG 
AAATTGGTAT CTTTTATATT CCAATGAATA CCTCTTAAAT 
CTTGCTAACA ATTCTTGTAA TTTTAATTGT ATTGCGTCTA 
CTTAAATACT TTTCCATAAC TATCTCCTTT ATATAATTAT 
AATTATGGTT TTAATACCAT AATAAATAAA AAGGATATTT 
TTTTTACACT GTTTTTATCT CAAGCATGCA ATTTAAGTAC 
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TTGAACTTTA AAAACAAGGG 178020 

TTTATTTGTG TCTAAAGAAA 178080 

TACAATCGAA TCAATCAAAG 178140 

AGGAATAATA AATCTAGAAA 178200

AATCTCTTTG TCTTCAAGAA 178260

TATTAATTTT TTTTTAAAAA 178320

AGTTAATTTT GGAATTCCAT 178380 

AACTACTCCT TCAAAATTGC 178440

TTTAAATTTA TTTTCAATAT 178500

CTTATTAACA TAAGCAAGTC 178560

TGTTTGGGCC TGAATTCCAT 178620

AAGAGATCTT TCAACTTCTG 178680 

TTGGCAATCC TTCCAATGAC 178740 

CTCTTGAGGC ATCCAGTCAG 178800 

TTTGCCAGTA TAATAAATAA 178860 

CATAATTCCA ATATTTCTAA 178920

ATAAATTCAT TAGCACAGGT 178980 

AACATTTGCT TTCATTCTCA 179040 

GCATCCATAA ATGTTTCTCA 179100 

AATCACCAGC AGTATCAATT 179160 

GACTGCAAAC AATGCTTTCC 179220

CCTTAATAAA GGATTTTTTC 179280 

TTCTTGAGCG TTCTGCAACA 179340 

TTTTTTTGTG AATAACAAAG 179400 

TAGAATAAAA AATATGCAAA 179460 

AATCATCCTT TTTTATATAG 179520 

TATAATACAT AATGAGATAT 179580 

AATGAAAAAA TTGATTATAA 179640 

AATGCATAAA ATAGATACAA 179700
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AAGAAGATAT 
ACCATCTAGA 
GAGAAAATAG 
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GAGTAGTTAA 
CCGATAAAAT 
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TTAGTAATAA 
CGGGAGTTCG 
AAAGCAAATC 
CTTTTATTAG 
TACCAGAAAG 
ACAACCCCAA 
CATCCTTATT 
AAACAAATAA 
TTGTTAAATC 
TTGAATGTAA 
TAATATCCAT 
TTTTATTACA 
GACTAGCAGA
TAGGATTAAC 
TGGGTTTTCC 
GAACTGTAGC 
TTTTAGCAGG 
TAAAACAATT 
AAAAGTTTCG 
AAAACTTGAA 
TTTAGCCCTT 
CTTTTTCATC 
GTATTCACAA 
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G AAAATTC T A Wl TCAGAAA 
AATAGATGAT ACCCTTGAAA 
AACAATAACT CACACCCTTT 
ATCCTTTAAT TTAACAAGAG 
TGCATGGCTT AATAGTCCAA 
AGGTGGCTAT AGATTAAAAA
AAGAAAATAT AAGAATTGAC 
AAGGGCTCAT AGCTCAGTTG 
AATCTCTCTG GGCCCAAAAA
TATTTTAGAA TCTTTAAGCA 
GCTTTCAAAA AGAAAAATAA 
AGCGCCCAAT TTGATTCCCT 
ACGCTTTAAA CAACATTGTG 
CTGAAAAGAA GCATAAGAAT 
ATAAGCATTT GATAATAGAT 
TCTTTTTATT GCTCCTATTA 
AACATTTTCT TTTTTCTCTA 
TCCAGAAGAT TTGCCATGAA 
ATCCTTATAT TCAAAATGAC 
AGAACCAAGA CCAACTCCAA 
TTTGCTATAA TTTGAAACAA 
TAAATATTTC CAATTTTTAG 
GCCAATAACT GGGAACCCAT 
CTTTCTTATT CTTAGCATTT 
GAAATAATTT TTGACAAATT 
TTTCCGGCTC CCAAAGCTTT 
GGCAAAGCTG CCGAAACTCC 
CTTAAACTAG AAATTAATGC 
TCCAAATTGC ATTTCAATAT 
ATAGACGTAG TAGTTTTAAT 
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TTGCTGAATT GAGAAAAAAA'^^AATCTAA 179760 

AAGTTGCAAA AGAATATGCC ATTAAACTGG 179820

TTGGCACAAC CCCAATGCAA AGAATACATA 179880 

AAATACTGGC ATCAGGAATT GAACTTAACA 179940

GCCACAAAGA AGCTCTTATT AATACAGATA 180000

CGACTGACAA TATAGATATA TTTGTAGTTC 180060 

ACCATTAAAG CTTATACTGT ATACTACTTA 180120 

GTCAGAGCGC CTGCCTTACA AGCAGGATGT 180180 

TAATCTAAGT CTCAATTACC TTTAGCTTTA 180240 

TGTTATTTAA TTCTTTTTGC ACTATATTAG 180300

ATGCTCCTCC TTTACCAGCG CCACTTAACT 180360

CACTTATCAG CCAATCAAGA GTATCATTAG 180420 

CAATATTCAT TTCATTAGCT AAAGAATACA 180480 

TGCTTACGGC AAGGCCAAGC TTTTCAATAA 180540 

CTTTTTTCAA ATTAACAACT ATTTCTTTAG 180600 

GAAAATAAAA ACCAGAATCT TTTATTTTCT 180660

AATAAAAAGT TCCATTAAGA TCGATTAGTC 180720 

AAATGTTTTC AATTTGATTT GCCAACAAAA 180780 

TTGTAATATA TTCTGCAAAG CATAAACTAA 180840 

TAGGAATTTC AGAAATTATA TCAAACTCAA 180900 

TAAAACTTAT AAGGCTATTT AATCTTGTAC 180960 

ATACACTATA AATCAGATCC ATATAAATTG 181020 

AAACAGCGCT ATGTTCGCCT AAGAACAATA 181080 

ATCGCTTTCA AACTTAATGC CTTCATTCTC 181140 

AAAAGCTTCA ATATTGGGCC TGTAAACTAA 181200 

AATCAAATCA CATTGACCTA AAAGGTGATC 181260

TATTGCCTCT CCAATTGCCA ATCCCAATTC 181320

GGATTTGGAA TTGCTAGCAT TTAAAACAAG 181380

AAAATCTAAA ATAGAATTTC TATGTTTATT 181440 

TGCTTGCAAA CCCTGCATCA AATAAAAATC 181500 
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ATTAAACTCT 
AACACCCCCA 
CCTGTAAGCT 
GTGAATTAAA 
CTTTTTAGTT 
ATTCTCTAGA
GCTTCTATTT 
AGAATCGCTC 
TGCCTTTTCC 
ACTAATCAAA 
AAATTTTGCT 
ACTTGGGGGC 
GCATACCTAA 
AAGGCAAACA 
TTTTAAAAAA 
ATAATAAAAT 
ATTACTATCA 
TTGAAAAGAT 
CGCAGAAGCC 
ATATTTCAAA 
TGCTGTCGGG 
TTTTCTTGCA 
TAAAATTATT 
ACTAACAGCA 
TTTAATTAAA 
TCTTCAAAAT 
ATAATTTGTC 
TATTAACATG 
CTGCAAAATT 
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ACAGCACCCA 
AAAATACTAG 
TCCAAACAAT 
AAAAGCCCAC
CCATCATTAA 
TTAAAAAAAC 
TCTATTAAAG 
TTTTTAAAAG 
TCTAAAATAG 
TCCATTCTAA 
TAAGTCCTTT 
CTGCATCCAT 
TTAAATCTAT 
TATTCTGATA 
AAATATAAAG 
TTATGCCGTT 
ATTATGGCAA 
TCTTTTGAAC 
GATCCTACTC 
ATACAAGCAG 
AAATTGTTTT 
TAATCAAAAA 
TCATCTCGAT 
AGACTGGAAG 
GCTAAGCTTG 
TTTAAAAGCA 
TTTTTCAAAA 
CAATCTCATA 
ACTAGCAAGT 



GCTGCCTGCA 
TAGCAATATC 
ATTTAAAAAT 
ACACTATGCC 
AGAAAAAATT 
AATTTTGACT 
AAAAATCGTC 
AAAAAAATGC 
TATACTCCCC 
GTCACACCCA 
TAAAATAGTA 
CGTCTCAAAT 
TGTACTATTT 
ACTTTTTACA 
CGTCTTTAAA 
TGCAAATATT 
ATATTATTCG 
CCTCTTTTAA 
TTGCAAGATT 
CAATAGAAGC 
TACTTTTAAT 
ACACCTTTTC 
TTGAAAGTTC 
TAGCTGGAAT 
CATGAACTTT 
AAATCAAAAG 
TCAGAAATAT 
TGACCCTTTT 
CCAACACAAG 
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TTTAGGATTA 
ATATCCACTG 
TTCACCTTTC 
AATAGCAACG 
ACTTGTATCA 
CAAGTAAGCA 
TATTTTTTTC 
TCTCTTGTTG 
CATTAAAAGT 
ACCTTTGAAA 
TTTAAATTTT 
ACAAAAATCC 
TTAAAATAAA 
ATAGTTGCTC 
AATCTTTTTA 
CATTGCGGCT 
CAAATCATTA 
AATAGTAAAC 
AGATGCGCTA 
AAAACCTGAA 
TTTAAATCTA 
TCTATTTTTT 
AAGCTCACTT 
GTTTAAAAAA 
ACACTTTATT 
AATAAATATT 
TATATTTTGT 
GAATCCCATT 
AGAGAATACC 




AAGCCGCCTT 
CCTATTCCTC 
TCAACAACAT 
ACAGCACTTG 
ATATATACAT 
AACATTTTAA 
TTTTTACTAA 
ATGGCAATTG 
AAATTTCCGG 
CAATAAAATC 
CCTCCAAACA 
CCTCATTTCT 
AAATAGAAGA 
CAAAATGTAT
CTAGAGGCAA 
CTTGAAGACA 
AAATAAGATT 
CCCCCGTAAA 
TTACAAGAAT 
CTTGAACTTG 
ACATTTGGTT 
AATATAACTG 
ATTGAATAAA 
ACATCCTTTT 
TTCATTCTCT 
CATTCTTTCC 
CTTTAAAAGA 
AAATGCAAGA 
AATAAATTCG 



CAAACTCAAT 
CTTGAGAATA 
TGGTAGCATT 
AACCAAATCC 
CATACGCAAA 
AAACAAAATC 
AAAAGCGCCA 
CCAACCCCAA 
GTACAGAAAA 
AATGCCAGTA 
AAGAAACTTT 
CAAATCAGCA 
TGCAAACATT 
AAAATCCTTT 
TCCAAGCATC 
ATTCTTTTTC 
GATCTCTTAA 
TAGCCCTTGC 
ATTTATTAAA 
CAAGGCCTGC 
CATTAAGAAT 
GCTTTGAATT 
ACTTGTCAAC 
TCCCCCAATA 
AACCTTATTT 
ATTTCCAATA 

TGGAGTATTT
GCCCTTAATG
CTCTTACTAT 



181560 
181620 
181680 
181740 
181800 
181860 
181920 
181980 
182040 
182100 
182160 
182220 
182280 
182340 
182400 
182460 
182520 
182580 
182640 
182700 
182760 
182820 
182880 
182940 
183000 
183060 
183120 
183180 
183240 
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TTACATTCAT 
TAGTTCCAAC 
ATTTACTAAG 
GAGCTCTTGT 
TATTATTAGT 
TTTTAGCCAA 
AACGGGCTTT 
ATCCGAATTC 
CACCCATAGC 
TGTGCCTAGT 
GTTCAATCCA 
CCGATTTTAT 
CATTTTCAAG 
CAATTGGCAA 
AAGATAAATA 
TATAAAAAAA 
AAACGCTTTT 
TCATCAAGCA 
CTTCCACTTA 
ACAGCATCTT 
CTGGCTCCAA 
AAAATATTAG 
CAATCAGAAA 
CAATTAGTTC 
TTAACGTCTT 
AAGTCTGACA 
TTCATCAATT 
ATCATTTCAG 
AACAAAGGAA 
TTAAACAAAA 
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AATTTTAAAA "WTAAAATTG AAGCTTCATT 
TTGCAAAGGA ATTTCAATTT CTCCAACTAA 
GGGAAAATAT TTGCCGCTTT TTGAAGCAAA 
GTCATTAAAA GTTGCAAGAC ACACCCCTGT 
AACAGCTCGC TCCTCTTCGT AAAAACCTAT 
ATTCCAAGAA TCCTCTTTAC CCGGTAGCAA 
GGCTGTAAAT TCACTAATAT CATTGCTTAA
TAAAAAAATA AATTCTGCTA CACGCTCTGC 
ATCACAAGTA TCCACATAAA TATTTAATTT 
TGACAACCTT CTAAATCCAC CCCCCCTTTG 
AGTCTTAATT TTATGACCAA GGTCAACAAA
ATAAATTTGC GAAATTCCCA ACACTTCACC 
AATCTTTGCC GCAAAATTTA GGGCAGCAAC 
AGAATAGTAT TTGCCATTTA TTTTCAAATT 
TCCAATATAA TTTTCTATCA TATTAAAAAG 
ATCTTTATAA GATAATTCCA AAAAACTTTT 
ATGTCTAAAA TTTTTACTAA GTTCCATAAA 
AATAACTACT TAAAAAATAC TTATTATTTC 
AAAACATAGA CATTTTTAAA ATATGTTCAT 
CTCCTGAATC ATAAAAAGCC CTAAGAACAA 
GGGCAATGCC TTTAGCAATA TCCATGCCCG 
CCTTTAGAGA ATCATCAATA CTAAGTAAAG 
AACAAGATGC AATGTTTAGA TTATTACTCT 
CACCACTCCC TGCAAGATCA ACATAAGAAG 
TTGGCGAAAT TCCAAAACCT GTCTCTTTAA 
ATTTGGCTAT TGACTCTCTT ATTCCTTTAA
CTTGTCCTGC ATTAAGATGA ACAATAATTG 
CTATTTTAGA AATACCAAAT TCAACAATCT 
TATTATGAGC ATACCTTTTA AGAGTAAAGT 
GCTTAAAAGA ACCTAGCCCT ATAGGAATTT 
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AAAAGATATA ^!R:CCACCTT 183300 

AGCATTGTCA GTAGTATAAA 183360 

TTTATGAACA GAGGCCTCAA 183420 

AATTCCATTC ATAATACCTT 183480 

ACTAGAAATA AGTTCAATTT 183540 

ATGCTTAAAA TCTAAAACAA 183600

AACTTTTAAA ACACACTCAT 183660

AATTGAGTTT AGCAAATTAG 183720 

TTGAATACCT AATTCTTTAA 183780 

ATTCATATTG GTTAAAAGAG 183840

AATTTTACTT AAATCTTTTT 183900

CAAAGAATAC CTTAAATCAG 183960

AACAGAAGAT TCTTCTGTTG 184020

TTTTACAATT CCAATAGGAA 184080 

AAAATCTTCA TTGGCATTAT 184140 

TATCTCTTGC CTTTTTTCTA 184200 

ACTGCTTAAA GACTCCAAGT 184260 

TAAACTCTAA TAAACTTTTG 184320 

AATCAGAAAA AAGACCAAAT 184380

CTGCTGCAAC ACCTATAAGC 184440 

TCTCATATCC ACCAGATGCA 184500 

TAAAAACCGA AGGTATACCC 184560 

TCATGCCTTC TACTAAAATC 184620

CACCAAGGCT GAACAATTCC 184680 

CAATCAATGG AACACTTAAA 184740 

AATTTCTATC TCCATCAACC 184800 

CATCAACTTC TAATCTTTTA 184860

GAACAGCACC AACATTGGCA 184920 

CTCTTATGTA CTCGGGATAC 184980 

TTAAATAATT TGCAATTCTA 185040
ACTAAAGATT TATTAAAGTC ATTCCCCTCT TTACTGCCCC CTGTCATGGA AGAAATAAAA 185100

ACAGGCATAC TAATATTGTA TCCAAATATC TCTTCTTTTA TGTTTATCTC GGAAAAATTA 185160 

AAATCACTAA GAGCATTGTG TTTTAGCTTA ATAAACTTTA AGAAATTACA GCCACCTTTA 185220

ACATCGTTTT TATTTAAACA AATCTCAATA TGCCTTTTTT TATTTTCTAA TATATTAGGC 185280

TCGATACCCA TAAACTCGGT ATCCATCATT CCTTAGTTCT TTTAAATAAA ATCCTCTGGA 185340 

TTCACCAGGA ATTATTTTAT TTTGAAAAAA ATCTTTATAT TCTTCAAAAT TTGCATTATT 185400 

TCTATTTTTT ATAAGACCTT CAAGATCCCA TAATTTAATA ACATCAAAAG CACTCTTTTC 185460 

GATGGTAAGT TCATAAATAA TCATAATATT GCCAGATCCA TAAGAGCAAA ACAATATCTT 185520

TTCCCCTGTA ATATCTTTCT TGGAAAATAC TCTTTTTAAA TAAAATGCTA AAGATAGAAA 185580 

AATTGAACCT GTATACAAAT TTCCCACTTC CATAGCAGCT TCAACTCCAT CGTAAAAATC 185640 

TATTGATTCT AAATAAGCAT TTCTAACAGA TTCGTCATCG CTATAATATT TTTTCAAAAT 185700

ATAATGCATT GAATCTATTG GCATTTTAGC AAAAGGAACA TGCAAAAGAA ACCTATAATT 185760 

AGAAAATAAA TCTTTCATAC TAAGTTGCTT TTTGAAAGCA AAATCTCTTA AAGCATTTTC 185820

GTTTGCATTA TTGTAACATT CAACTGAATA CTGACCTCGC ACCTTAGCCT CAACACTTCC 185880 

AAAAGGCCTA AAAAAATCGT CAACATCATC AGTATAAACT CCAAATTCAG ATAAATTGAT 185940 

CGAAAGTAGC TTTGGATTTT TTTCAATCAA AATTGCAGTT GCGCCGGCTC CTTGGGTAAT 186000

CTCAGCCGTA GTAAGATTGC TATAATGTGC AATATCTGAA GAAAAAACTA TGCCGTATTC 186060

AGAATTATTG GAATGGCTTA AAACACTTGC CACAGTGTGC AAAGACATAG CAGCACCAGC 186120

ACACGCATGT TGAACCTGGA AAGTTAGAAA ATTATTTCCC AGACAAATAC CAGATTGCTT 186180 

TAAGGCTCCA AAAACATAAG AAGAAATGGC CTTTGAATGA TCAACGCCTG TTTCAGTTCC 186240

ACCCAAAAGT ATTCTAATTT TGCTTAAATC AAGATTATTG TTGTCAAAAA TAAGCTTAAC 186300

AGCCGAACTT GCCATGGTTA CACTATCCTC ATTAGGACTG GTAAACCTAA AACCTTTTTG 186360

CAAGGTTGCA TCTATTGCTC TATTGATTTT TTTAAAAAAA ACTTCATTAG AAAAATATAA 186420

AGGATTTTCC AAAAGAACAG AAAAATCTAA ATAATTTAAA GGTAAAAAAA TTCTAATATC 186480

ACTAATACCT ATTCTCATAT ACTCCTCAAT GAATTAATGG CCTTAAGTAT TATATTATAA 186540 

TTTACAAAAA TTAGCAAAAT CTTATATAAT AAAACCTAAA AATGGAAGTT TATGAAAATA 186600

GCCGTGCTTT TATCTGGAGG AGTCGACAGT TCTGTTGCCC TTTATAGAAT TATAAACAAA 186660

GGATATTCAA ATATAAAATG CTACTATTTA AAAATCTGGG TTGAAGATGA ACTGTCTTAT 186720 

ATTGGAAACT GCCCTTGGCA AGAAGATTTA AATTATGTTG AAGCTATATG CAACAAATTT 186780
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AATGTACCGT 
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AACAAAAAAG 
TGCCTGGGGG 
GAAGATTTTT 
ATCTTTTACA 
TGTAGTTTGG 
ACTTTGCTTA 
AGGAACTATT 
AATATCAATA 
AGCTACCACT 
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AGTATTAACA 
TGAAATTTGC 
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ATTTTCAATA 
GTACAAACTG 
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ATGAAATAAT "T^ACTTTCAA AAAGAATATT 
AACTTAAAAA TGGCAATACC CCAAGTCCAG 
GAGCATTTTT TGAGAAAATC AATAGCCAAT 
AAATACAAAT AAAAGAAAGT AAATTTTTAT 
AAAGCTACTT TTTATCTCAT CTCTCTCAAA 
GCACATTACT TAAAAGCGAA GTAAGACAAA 
ATAGAAAAGA TAGTCAGGGT ATTTGCTTTT 
AATACCATCT TGGAGAGAAA AAGGGAAATA 
GAATTCACAA CGGATATTGG TTTTTTACAG 
ACGGGCCATG GTTCGTCATA GAAAAAGATC 
ACGAGAATTA TTTAAAACAA GCAAAACGCA 
ACGACACACC TACGAACTTT GAAAATTTCA 
ACTCATGCAA ATTAAAACTT ATTACAAATA 
ATCAAGGAAT CTCCCCAGGA CAATTTGCAA 
GTGCTAAAAT TTTTAAAATC ATAGAATAAT 
CAATCTTCTA CTTACTTTTC GATCTTAAAA 
CTCTCTAACA TCTTTTCAGA CATTGCAGAA 
ACTAACTGAC TAACCTGCTC TATTGCATTT 
TAACTTTCAT TAGAAATATT TTTTACAAGT 
TGTTCAAAAT TTTCCCCAGC ACGACTTGCA 
ATCTCTCTTG CTGATTCTTT GCTTTGATCT 
TCAAATCCCT TGCCCTTTTC TCCCACTCGT 
AAATTGGTTT GCCTTGTTAT CTCATCAATA 
GCCTCAATAG CCTTAACAAC AGATTTATGC
GCAATTTTTT CAGTAGTAGC TGCATTTTCA 
TCAATATTTG CTGTCATTTG CTCTAAAGTA 
TTCTGGCTTG CATTTGCTAT TTGAATTGCA
ACTCCTTTTG CAACTGAAGA AAAATTGGTT 
TAAAGCTCTA CAGTATCCCA TTTGCCAAAA 
AGTCTCTCAG AATATTCCAG TATCTTATTC 
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ATAACAAAGT^GTAAGCTAT 186840 

ATATTTTTTG CAATCAAAGG 186900 

ATGATTTGGT TGTAACGGGA 186960 

TAAAACAGGC AAAAGATAAA 187020 

AACAAATGTC AAAACTATAC 187080 

TAGCTAAAAA CATAAATTTA 187140 

TAGGAAAAAT TAAATATAAC 187200

TAATTGAAAA AGAAACGGGA 187260

TTGGACAAAG AAGAGGAATA 187320 

TGGAAAAAAA TATTATATAC 187380

AATTTTTAGT TCACGAAATA 187440 

AAATTAAAAT AAGACATGGC 187500 

ACTTAATGGA AATTTCTTTA 187560 

TTTTTTATAA AAACACAGAA 187620 

AATCCGCCCA AAAAGTTAGA 187680 

TAATCAACAG ATTCTTTTAA 187740 

AGCTCTTCAC TGCTTGAGGC 187800 

TTAAATTGCT CTATTTGAAC 187860 

CTGGCTGTTT GTTCCATACC 187920 

ACAGTTAAAC TTCTGTTTGC 187980 

GCAAGCTTTC TAACCTCAGC 188040 

GCAGCTTCAA TCGAGGCATT 188100 

ATTCCAATTT TTTCAGTAAT 188160 

CCCTCTTTAG TCCTTTCATT 188220 

GTATTCTCAG AAACACCTTG 188280 

GAAGCCTGCT CAACAGCGCC 188340 

TTTTCATAAA GATAATCTAG 188400 

CTCAACTGCT CAAGCCCTTC 188460 

TTAATATCAG CAGTAAAATT 188520

AAAGAAGAGC TTAACTTTTT 188580 
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CACAAGATAA 
GATTATAGTT 
AATAAATTTA 
ATTGCTTTTA 
TTCAGATGTT 
TGGCAATACT 
GCTAGAGAGT
AAGAAACATA 
TAATATGTAA 
CTCCTCAATA 
CTCCTTAAGT 
CTTAACCGCC 
GGTTTGAATA 
ACTAGCCTCA 
TTCCATTGAA 
AAGATTAAAC 
AATTGATATA 
CATACTAACC 
ATTTAAATCC 
TCTTACAGAA 
AGTAGAAATT 
ATCTATTAAC 
TTTGAAAAAT 
ACTAGAAAGC 
AGCCATTTTA 
AACAAGATTA 
AGTTGCTACC 
CAAATCAGCC 
TGCTCTTGCA 
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AGAGTTGCAA 
GTAGCTCGTG 
TTATTAGACA 
GGATCATAAT 
TTCTTAATAA 
ACATGATGAA 
ATTCCAAAAT 
AAATAACCAA 
GGAATCTGAC 
GACCCGGGAT 
TTTGTAAAAT 
GTAGTAAACA 
AGAATTTCAG 
TTAAATTTTT 
GAGATGTAAA 
TGTTGATCCA 
AAGAATGCTA 
ACCTCTTTTA 
TTATCAGCAA 
GATTCGCTCT 
CCATTGCTTT 
CTGAAATCAT 
AATACAGATT 
TGCTCACTGC 
AATTGAGCAA 
GCCGTTTCTT 
TTAGAGTTAT 
AACTTTCTAA 
GCTTCAATAG 



TAGCAAGCAT 
ACATGTAATA 
AGTTTAATAA 
AATAAACAGT 
TATTGGAATA 
CCAACAATCT 
TAATCCCTCT 
AAATGTTATC 
CTTTTTTATT 
CTGCCAACAT 
ATTCTCTATC 
CTATTTTACC 
ATTGATCTGA 
CAGCAGAGTT 
GATTAAAAGA 
CCAATTTACT 
TCAGAATAAG 
CAAAAAATAA 
TATCGATAGC 
CAATATCTAA 
TCAAATTTTG 
AATCATCATT 
TTCTAAGTTC 
TTGAAGCTGA 
TCTGATCGCT 
GAATTTCGGG 
CTTCAACTAA 
TCTCACTGGC 
CCGCATTCAA 
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AAGTAATGTA AATACAAAAC TAATTGCTAA 188640 

AAAATCGTCC TCTGAAGTTC TCATTAAAAG 188700 

CACCTTTTGA CTAATTCCCA CATATTTCTT 188760

TGAAATTTCT TTATTCTTTT GCAACAAATC 188820 

AGAAGCACTA ATATCAGTCA AAATATCACC 188880 

ACCCGTAGTA TCATAAGCTA GCGCACGACC 188940 

AAAAGACCTA TATATATAAT CCATTGAATA 189000 

TGTTTCAAAA TCTCTTAATG GCATACCTAT 189060 

TTTTATCTTG GAAATATCTT TAAGTAAAGA 189120 

GACAAAGGAA TTGTAAACAA TACTATTAGA 189180 

CCCAATAGAT TTGCCAAAAT CACTATTATC 189240 

TTCTTTGTCT GTAACCATCA TATTACTACC 189300 

AATAAAGCTC AACCTTTTGG ATTTAATTTT 189360

GAAATACATG GAACTAACCC TAACTTTCTC 189420 

ACTTTTAATG CTTTCAATAA GATTTATCAT 189480 

ATTAATAAGC ATTCCAAAAG CAAAAAACAA 189540 

AACAAGTAGC AACATCCTAG CTTTAAGCTT 189600 

ATTCTAAAAC TCTGAAAAAT CATCATCAAA 189660

TTTTTTAGGA TCAACTCGCT TATTAATAGT 189720 

AGAATAATTA TTATGCCCAC TGGCATTTGA 189780 

ATTTTCATCT TTAAAAGAAT TTTCAGGACA 189840 

TTCTGGATTT TCAATTTTAG AATCTTTAAT 189900

CTTAGACTTT TCTAACATTT TATCGGACAT 189960 

AGATTGAACA ACTTCTCCAA CCTGATCTAA 190020

TTGCTTAGAG CTACCTTCTG AAATCTTCTT 190080 

TAGCATTTCT TTAAAGATCA CTCCCGCTTC 190140 

CTCTCCAATC TCAAGAGCAG AAATTTTACT 190200 

CACAACAGCA AATCCCTTTC CCTCATCTCC 190260 

AGCAAGTAAA TTGGTTTTTC TAGCTATCTC 190320
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TTCAATAACA CTAACTTTCT ^RCAATGTC TTGCATAGCA 
CCTACCACCT ATCTGAGAAT TTTCATTCGT CTTTAAAGCT 
ATTATTGGCG CTCATGTTGA CACCTGAGGC TATTTGCTCA 
AAGAGCAGAT GCCTGTTGCA ATGCACTAGA GCTTAAATTT 
TAAACTTGCC TTATTTACAT AGCTAATATT TCTCAAAACA 
AGCTTTTTTC ATTTTAACAA CCTGAAGACT TAACATGCCA 
ATCATCATCA AGAGCATAAT CTTTATCTAA ATTGCCCTTA 
TCTAATTGCG TTTAAACGAA AACTAATAAT CCTGTCTATT 
TAATGCTATA ATGCCTAAGA CTGAATATAA AATATACTGA 
TCCGTAAATA TCCTTATAAG GAAGCCTAGC AATAAGTACT 
ACTACTTATG GGCAACATTG CATAATAACA ATCTTCTCCC 
ATCAATAGTG TAAACCGACA CTTCACTGGC AATGTTTGAT 
AACATCTTTA AGAACATTCA AAAATTTAGA ACTAACCCTG 
AAAAGGATTA ACTGCTATAT TGTTGGGATC CACATAAATA 
ACCGAATCTA AATCTATCAA AACTATCTGC CACAATATCA 
ATACCCACAA ACAAGTTTAT CTTCTGGGGA ATATACAGGT 
TTTTTCGCTT TGTTTAGACC TAATAGCAAC TTCTGCGGAT 
ATACCAACCT ATAAATTTTA ATTGGTTTTG CCTATAATCC 
ATTGGTATTA GCCTCAGAAT GACCAAAATC CATATTATTC 
TACTCTCCCT TCAAAATCAA AAAAAGCGAA TTCTTCAAAA 
GGCCATAAAA TTGTATAAGT ATTGTCGATA TTTTTTGCTC 
AAATTTTGGA TTTTTTCTTA AATCTATCAA TTCCGACTCA 
CTCAGACATT GCAAATTCTG ATATGGTTTC AAGTGCCAAA 
TATGACATGC AGGGTGTCTA AAAAAGATTG CAAAGAAAAA 
CCTTGTAAGC TGCTTATAAT AATCTTCTAA ATAACCGCAT 
GGAAAAAAGT AGCAGTATAA AAATTAAAAA CAATAATAAA 
AAGCTTCAAT AACATAATAA ACTACCTCAC AAATCACCTA
CAAGACC^AA GGGTATCAAA AAAAAATTTA CAGCAAAAAA 
AAAAGAATTT TCACAACATG AAAAACAAAT TATAAACTAT 
TTTTATAGGT TATTCCTAAG AACTTAGGCG CATAATTTTT 
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ATAAC AGATT 'WFCAACGGC 190380 

ATTTGTTCTG TTTCATAAGA 190440 

ACATTAGCTG ACATTTCTTC 190500 

TGACTTGAAC TGGCAACTTC 190560 

CTAGAAATTG CTACAGAAAT 190620 

AGTTCATCAA GAGTATTTTC 190680 

ACCATATCTT GAACTAGAAC 190740 

CTAATTGAAA GAACAATACT 190800 

AATCTTAGAC TAGATATTAC 190860

CCACTCTTTT CTCCCAATTT 190920 

ATTTCGGACA AAAGTATTCT 190980 

GGAAAAGGGG GCTTAGAGAA 191040 

CTGGTTTCAT TATATTCTTC 191100 

AAATTGCCTC TTTTATAAAA 191160 

TTAAGCAAAT ATCCGGCCAA 191220 

ACAATTATTG CAAAAGCCTT 191280 

ATTCCTTCAG AAAGATTTGA 191340 

TCAACAGCTT TTTTAAAATA 191400 

TCATGTCTTG TGCTAACAAT 191460 

AGGGTATCAT TTTTAAGATT 191520 

ACCTTTACAG AGTCAATAAC 191580 

GAGAAATCTT TTCCTCTATT 191640 

TTAGAAGCTG CACCATTGAT 191700 

GCTGCTCTTC TTACTTGCGC 191760 

AAAACAAAAT TAAAAATCGT 191820 

AATCCAACAA ACCTGTATTT 191880 

CTTATTTAAT CAAATAAACT 191940 

TCAAAAACTC TCAAAAAAAT 192000 

TATTATTAAC TGCTAATGCT 192060

TTTTGAAAAA AAGATAAGAC 192120 
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TTAGAATAAT 

AAAAGGGTAA 

CTATTAAAAC 
CTTGTCCACC 

CTGAAACCCC 
TAACTCCAAC 
TTTTAGTGTA 
AATATCTTTT 
GTATCGGCAA 
TAAAAACAGC 
ATTTTAAAAA 
CAGCAAAAAT 
CTCCTATTGT 
GACCCCCAAG 
ATATTATTAT 
GCAGAAATTA 
ATTAAACTTT 
ATTATGCCAA 
CCTTGCATAT 
GCAAGACCTG 
ATACCCATAC 
ATAGTTTTAT 
AAATGAAGAG 
CCTTCTGGTG 
TTTGAAAAAT 
TGAAATAATA 
TTTAAATATC 
GTAATTACAT 
GATCCTAGTA 
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AATTAAATAT 
TTGAGCCAAA 
TCCCAAAGGC 
TGTAACCCCT 
TGCTAAAAAA 
AGACTCTAAT 
TTTGAAAACA 
GCCAAAAATT 
CTTTATTTCT 
AATAGCGGGT 
AATTGTAAAA 
TGAAAACAAT 
CATTATTCCT 
ACCAGCTAAA 
ATTTGACACG 
CAAGAACAAT 
GAACTCTACT 
TTGGCGAATT 
AAGAAAGCTT 
CAACAGCTGC 
ATCTTGAAGC 
TAAGTAAAAA 
GTGCTTTTAA 
AGAGCTTCCA 
CACTATTATC 
TCCAATTAAA 
CGATTAAAAT 
GTAAAATTGG 
TAAACTGGCC 



GGGGTAATCA 
ACAATTGCCA 
GTCCATTTTC 
TGCACATAAC 
CCACTCAAAA 
ACCTCTGGAT 
ATATGAAACA 
TGAAAAATAA 
ATAGGCGGAG 
CCTAAAAAAT 
ACAGCGTGCA 
GGATCATTTG 
TCAAGTCCAA 
ATTAAGGTTT 
CTTAACACCT 
TATTCCCATC 
GCTTCCATAA 
GTTTCCCATA 
AAATATAGCT 
TGAGAGAAAC 
TTCAATATTA 
CCATATTAAA 
AAGCTCATTA 
AGAAGCTAAA 
TCTTTTAATA 
CATTATTCCT 
TCCTAAACTG 
AGGCAAATCA 
TTCAACCCCA 
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CTAACATTTT GGGAGGCATT ATTAAAGAAA 192180 

AAGTTTTCAC AAATGAAAAC AAAAAACTGC 192240 

CAAAAATCAA CATTACAATA GCAATAAAAC 192300

TTGATGCAAC CACCGTTGTA AGAACAGCAC 192360

GAACGCAAAA AAATCTAATT TTATTTACAC 192420

TTTCACCACT TGCATTAATT CTAAGCCCAA 192480

AAACCACACT TAGGATTGCA ATGTATACAG 192540 

AAGATGTTTT GTTTAAAATT CCATCAAAAA 192600

TTGAAATAGA AGAAAAAATC AAAGTGCTTA 192660

TAAGTGCCAT TCCGGTTATA ATTTGATCTG 192720 

AAATAGCAAG AACAAGCCCT GCTAGCCCAC 192780

TAAAATATGC AACTGTAGCT CCTGAAAATG 192840 

TATTAATAAT TCCACTTTTC TCGCTTATAA 192900 

GAGAATTTAT TAAAGTTTCA CTAATCAAGA 192960 

TTTAAAACAA TTTTATTTAA AAAATAGCTA 193020 

ATCAAAGATA CAATTGAAGA TGGAAGGCCC 193080 

AGCAATATAG AAAAAAGAAT GCTAGAAAAT 193140 

AGAGAAGCAG CTATCCCATT AAAACCAATT 193200 

TTATTAACAC CCATAAGTTG AATAGCACCA 193260

ATTGAAAAAA TTAGCACAGC TTTTACATTA 193320 

CTTCCTGTGG CATTTATTTT AAATCCAATA 193380 

ATAGCAAAAA TTATACCTAA AATTATTCCA 193440 

ACAAAAGGAT GAGAAGATCT ATAAGCAAGA 193500 

AAATCAATAT ATGCGCTTTC TTTAATGGGT 193560 

AAACTAAAAT CTAAAATTAT ATTATTTAAA 193620

GAAATCACTT CGCTAATATT GAATTTGGCT 193680

CCTGAAGCTA AAAAAGTAAT AATAAAAATA 193740 

AGTAAAACTG ATGCTATTAA AGCAACAATA 193800

ATATTAAAAA GACCCGCTTT TAAAGAAATA 193860
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CCAATAGAAA 
TTGGGAGAAG 
AGCCCCATCA 
AATGCTGAAG 
ATTTAGCTTA 
ATGCCCACTA 
TCAAGTTCAA 
TTGTAAATAT 
ATATCTGGCT 
AAATGCTTTA 
TTGCTCAATT 
AATTGTCTTT 
ATTGTTTTTA 
ACATTGAATT 
TATCTATTCT 
TTTTCAAAAT 
CCTGACCACT 
TAACGTCTTT 
GAATATTAAA 
TCATTAATTT 
TCCCAAGACG 
TAATAAGTAT 
CAACTTCACT 
GATAAAGAAC 
TTTTTTCTAA 
CTTGTTTGTA 
CAGCAGTAAA 
TAGTTCGCTT 
TACTTGATGA 
AAGAATAGCA 



GACCTGTAAA 
AAAAAATAAT 
CCACTAGCCC 
AATTTAAAAA 
AGCCTATCAT 
TTCTTCCACC 
GAGAAACCAA 
TCTCAACAGC 
CTAAACTAAT 
CCTTGCTTAA 
TTCTTAAAAT
TAATAAAATT 
ATCTCAAATA 
CTAAAATAAG 
TTGTTTAATA 
ATCGCCCTTA 
CCCCTCAATA 
AACTTTTAAA
ATGATTTTCA 
TGTAAGATCT 
CATAATTGTA 
TACAGTATGA 
TGGAGCAAGC 
TTTCAATATC 
ATCTATCTTT 
ATCAAGAAAA 
TTGCGGAATT 
GAATCCTTAA 
ATCCCATAAA 
TGAACTTCGC 
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TCAAAGGA 
TTCTAATATT 
AACAATTAAA 
TTTCAAAATA 
CATTTTACCA 
ATGGATTACA 
TAAAACAGAT 
TCCAACATCA 
CTCACGAGCA 
AATATCTCTT 
ATTAAGATCA 
GAAAAAATTG 
ATCAGGATTA 
GCCGTGTTTT 
GTTAAACCTT 
AATATGCTTT 
CCTGATATTC 
ACTCCTCTCT 
AATTTAATTT 
TTGTCATCAA 
CATTTC^TTG 
CCCTCTTGGG 
ACTGCTGTAG 
TCTATTTTTT 
AAACCATACT 
CCAAATTTTG 
AACATAAAGT 
AGTTTATTTC 
TAGTTTTCAT 
CTGCCTTAAA
GCTGAATAAC TTAAAACAT^RCTAAATGT 193920

ATAAAATACA TTCTAAAAGG AGAATGACCA 193980 

AATCCAACAA ATAGAGCAAA TACACTAACA 194040 

AATTTACTAA ATACGTTTTT ACTAATTGTC 194100 

ATAACATCAA TATCAAAATT ATCCTCTAAA 194160 

GCTATCCTGT CACAAACATT AACAAGCTCA 194220 

CTACCCGCAT CTCTTTGCTC TATTATTCTT 194280 

AGGCCTCTTG TAGGCTGAAT AGCCAAAAGA 194340 

ACAATAACTT TTTGTTGATT ACCTCCGGAT 194400 

GGTCTAATAT CAAAATGACT TACAAGTTGA 194460

AACCCCACAA ATTGTTTTTT AAACTTATTG 194520 

AATTTTAAAT CAAAATTACT CTTCAAATGA 194580 

TCAAAGCTCT TAAGTCCAAT ATTTTGCATA 194640 

TGGCTGTCCG AAGAATATTG CCAATTTTTT 194700 

TTAAAGATTC CAAATTCCCC GAAGAATTTT 194760 

TCAAACCCAA AATTGCATCA ACTAAATCCT 194820

CAAGAATTTC TCCATTTCTC AGATCAAGAT 194880 

CATCTTTAAC ACTTAAATTC TTTATTTCAA 194940 

TAGATGAGCG AAGTGCAACT TCTTTTCCTA 195000 

TATCAGCAAT ATTAACAGTT TTTACAACCT 195060 

CAATAGATCT AATTTCTTTT ATTTTATGGG 195120 

CGAGTACCTT TAAAATATTT ATAAAATCAT 195180 

GTTCATCAAA AATAATAATA TCTGCATTTC 195240 

GTTCCATGCC AACACTCAAG TCTTCAACCC 195300 

TTTCCGAAAG AGAACTTATC TTTTTTCTAG 195360

AATTTTCATA TCCCAAAATA ATGTTTTGAA 195420

GTTGGAAAAC CATTCCAATC CCATTTCGAA 195480 

TTGACCTTTT AAAATAATTC GACCACTATT 195540 

TAAGGTAGTC TTTCCAGCAC CATTTTCTCC 195600 

TTTAATAGAA ACATTATCAT TGGCAACAAA 195660 
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ATCACCATAC 
ATCCTAATAT 
AAAGCTTAAA 
TATAAAAATT 
GCAAATTAAT 
TCAAAGCCTG 
TCTCAATGAA 
ATGCCCTTGT 
TTGAACATGA 
ATAAAAACCT 
TGAATAAGCT 
ATTTGGTGAA 
ATTTTCCACC 
AAAAAGAAGA 
CTGGATTTAT 
GTGGAGCAAA 
AAAACCATAT 
ATCTGCCATA 
TGCTACATTT 
GGATAAAACA 
TAACATTGTA 
TAACTTTGTC 
AAGCAATGGA 
AATATACTTA 
TATAAAGATA 
AGAGCTTACT 
TCTAGACGAA 
AACAATGTCT 
ATACAAAAAC 
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TTTTTTGTAA 
AAATATCAAA 
ATTTATATTT 
AATTGCTAAT 
GTTAAAATAA 
CATTTTTGAT 
TTCTGCTCTT 
TGGCAGAGGA 
TAATCCTAAT 
TTCATTCCAA 
TAACATTCCA 
AAATATTTTT 
AAAGCCAGAT 
AATTGCATAT 
GCCAATAGGG 
ACATATAATA 
TTTCCTTATT 
GAAAAAGAAA 
ATCAAAAATG 
GATATAATGA 
AAGCCTCTTT 
AAAGTCAACT 
TACATTACTT 
ATACAAAAGG 
CTACTTCTCT 
ACTAAAAACA 
AAGACAGCTA 
CCAATTTCAA 
AATCCGGTTA 



TATTTTCTAA 
CATAAATGAC 
TCAATAGCAT 
AAATAAAAAT 
AAGAATAAAT 
ATGGATGGAA 
TCAAACTTAG 
TTTAACAAGT 
TTACAAGAAA 
ACAAAACCAT 
ATTGGAATTT 
GGAAATATAT 
CCTGAAAATG 
ATTGGAGACA 
GTTTCTTGGG 
CACAATCCAC 
TATATCATTA 
TTGAAATACT 
ATTTTGAATC 
TTACCCCATT 
TTAGTAAAGA 
CATCTTTAAG 
TTTATATAAA 
AAAATAAAGC 
TAAAAGTATT 
TTGAATCCAC 
AGCTTATAAT 
CATTAATTGC 
AAGGTTTTAT 
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TACTAGTACA 
TTACAATATT 
TTTAAATTTT 
GCTTAAAATA 
AAATTATTAC 
CACTGGTAAA 
GATACAGTAA 
TTGTAATAGA 
AACTTTACAA 
ATGAAAATAT 
TAAGCAATAA 
TGTTTTTTGA 
CCCTTGATAT 
GCGATGTGGA 
GATTTAGAAG 
TTGAACTATT 
TCTATTCAAT 
CAACTATTTA 
AGAAATAAAA 
TGTTTTATCT 
ATTGACAAAA 
AAAAGAATTC 
CAAACTATTT 
ACTTTATTCA 
GGTAATTAAA 
TTCAAAAGCA 
TGAAAGCTTT 
CATTTTTTCA 
TGGGTATGAT 



TCTTCTTTCA 
GTATTCTAAT 
AATAAACAAA 
TAAAAACCTT 
AAAAGAGAGT 
TAGTATAATG 
AATAGAACTA 
CACTCTAAAG 
AGAATTTGTT 
AAAACCCCTT 
GAACCACGAA 
AATCAGGGGT 
GATATTAGAA 
TATGCTAACC 
TGTTCAAGAA 
GGACCTAATA 
CATGAAAGTA 
AAAGAAAACA 
GACCTAATTC 
GGCATTGAAG 
AACGACTTAA 
TTTTATAATT 
GAAGGAAAAA 
TCAGACATCA 
TACTGCTTTG 
ATAAGCAATG 
TTCAAATATG 
GCCAGAGCAA 
GAAAGTTGGT 



TCAAGCTTTA 
GAAATATAAT 
ATAAATTATC 
AATAAAGAGA 
ATTATGAAAA 
GATATTGCAT 
AGCAAATTCA 
CTATTATCTC
AAAGAATACA 
TTAGAAACTA 
GAATTAATAA 
TATTCAAAAA 
TTAAATGCCC 
GCACTAAACG 
TTAAAAGAAA 
AAATGAATAC 
TAAAAAGTTT 
AAAAAACAAT 
AATACGTCAA 
CTATTGATTT 
ATTTGATATT 
TTAATACCAT 
ACTCTTATAC 
TAAAAAATTA 
AAAAAGGAAT 
ATACCGACTT 
AGACCTTACA 
GAACTCCAAA 
TTTCAATAAA 



195720 
195780 
195840 
195900 
195960 
196020 
196080 
196140 
196200 
196260 
196320 
196380 
196440 
196500 
196560 
196620 
196680 
196740 
196800 
196860 
196920 
196980 
197040 
197100 
197160 
197220 
197280 
197340 
197400 
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ACAGTCGGGC 
GGTAAATAAA 
TTTGAACTTT 
GCACTAATAT 
ATAACCAAGG 
GATATAATCC 
ATTGTCGCCC 
TTAAGACTTC 
AACCTAATAC 
TTTTTTATAA 
AATATCATTA 
TTCATCTTAA 
GTTTACTACA 
GATAAATATT 
GCATCATTTA 
CCTATTATTT 
CCTTTTACTT 
CTAAAAGAAA 
AAAAAAATAA 
TTGCATTGTT 
CAGATGAAGA 
CCAATTTTCC 
AAAGCATAAT 
ATAAATATCT 
AAATCAAAAA 
TTTACAAATT 
ACATTACTTA 
GGGAATGAAA 
TACAGACATT 
TGCTCCTTTT 
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TCTAGAGAAT ^fGATTCAAG AATAATTAAA 
TGGTAAAAAA ATTTTCAATT TTCTTAAAAG 
TAATCGAAGA ACTCTCAATA ATTCTTTTTT 
TTCTTGGGTT TCTATTTGAC ACAATTTTTA 
CCTACCTTTC CCAAAGATTA GAAATCTACG 
ACTGCCTTAT TCCTTTAGCG TTTTATAGCT 
ATGAAACAAT ATTAAATCCA ATAATGCTAT 
TTAGGTTTAA TGACCTAATA ATAGAAATAT 
TAATAGCATT TGCTAGGACA TTTTCAATGA 
TAATATCAAG CTCAAAAATT GTAAATTCAA 
AAAATATATC AATAATAAAT GAAAAAGCTT 
TAATCAAGGA AAAAGATGAC ATAATATACT 
GTCCCAGTGA ATATAGAGTA ATAGAAATGG 
TGCAAAGAAA AAGCGATTCT ATTCTTGGAA 
CTATTTTTTT AATGAATTTT TATAAATTTT 
TAATGACAAA AATTTTACAA GACCCATTAG 
TAAGCGAAGA AAAAGTATAT GAACTTGCAA 
AACTAAACTC AAAGCGAAAA AGCAAAATAC 
TTAATAAAAA CCAGGAAATA AAATGAAAAT 
AGATTTTCCA CTTAATGCCA GACTTTTGGA 
AATAAAAAAA TATTCGTCTT ATAATTTAAT 
AACAAGCGAA ATAGAAAAAA ATATTTATAA 
GCTCAATAAA ACTAACTACA GCTTATTAAA
AATTCAAAGC GAACTCATTG ATAAAAAATT 
TATAAATGGA ATTTTTAAAA GCCATTCACT 
AGAACTTTAC ATAGAAAATA ATGCAGAACC 
TTTTTTAAAG AATTTAGATA AAATAAGTAA 
ATAATAAAAT TAAAGCTTGA ACTTTTTTTA 
ACTCTTTGAA GAACCTTTGC CCTATCTAAT 
ATTAAGCAAT CCTTAACTAA TTGTTCCTTG 
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GAATTATC AG^RATAGCCAA 197460 

CAATAATAAT TTTTTCAATA 197520 

TACCATACAA AATACGATTT 197580

TTTTCATTTT TTTATACAAA 197640

TCAGAAACAA TCTATTCTTC 197700 

CATATCAGCT TAAAAACATA 197760 

CACTTTTCAA GTTAAGATTT 197820 

ATTACAATTC AAAAGAAAAG 197880 

GCTTATTAAT ACCATTTACA 197940 

TACCAGAAAA ACAAGAATTT 198000 

ACATTAAAGA AAAATATCCC 198060 

CAAAATCAGA CGAAATATTT 198120 

AGAAAACAAA ATTTTATATA 198180 

TTTTTCTATT TACATTGTTT 198240 

TTAAAGCAAG CTTTTTAAAT 198300 

AATATCGAAA AATTCAAATT 198360 

AATCATTTAA CAATCTCTTG 198420 

CTTTAGAAAT TGAAAAAGTA 198480 

TCAAATAATT ATAATGCTGC 198540

CATTTCAATT GAAAAAAGAG 198600 

TTTAGAAAAA GAATACTATA 198660 

ACTAACAGAA CATTTTGTAA 198720 

TTCAAACTAC AAAGAAGCAA 198780 

TTTAAAATAT AAAATATTTA 198840 

AATATATACA AAAAAAGGAT 198900 

TCTAAAAATA TTTAACCTTA 198960 

TGAAATGATT TTTTTCCCAA 199020 

TAAATAATTT ATTTAACAAA 199080 

GGTTTAACAA TAAATGTTTT 199140 

CCTAAAGCAG ATATCATTAT 199200 
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CACTCTAGCA TTTTTATCAA ATTCCATAAT ATTAGAAAGA CAAGTTATTC CATCCATTTT 199260 

GGGCATAGTA ATATCAAGAG TGACAATATC AATATTAGGA TAATGATTCT TGTATTTTAT 199320

CACAGCCTCT TCTCCATCAG CTGCCGTATC AATAATATTA AAGCCCTCTG ATGTAAAAAT 199380 

TTGAGTAAGC TGCTTTACGG TAAAAACAGA GTCATCAACA ATTAAAACAT TAAAAGGAAT 199440 

GCCTGTATCA TAATTGATTC CTCTAGGCTT AGATGAAGAA TCTGCAGCAA TTGTAGTCTT 199500 

TTGAATCATA TTAACCTCTC TCTTCTAATA AAAAGAATTT TTTTCATATC AAACCCTCTC 199560 

TCTTATTGCA ATATTAACTT CTATAATTTT ACCATCAGGC AAAGAAAAAG GAACAATTAA 199620 

AGCCTCAGAA CCTTTATTAC TTATTTTCAT ATTTTCTCCA TAAATAAAAG CTGGGGGGGT 199680 

TATATCAAAT ACAAAACCCT TGGCATGCAA AGTGGTAACA AAATTTCCAG CAATAATATT 199740 

GCCAACCTCA GTTAGAGTTG CAGCAACCAT CTCTTTTGTT TCTTCATCGT CAAAATCATC 199800 

ATACTCTTCA AAATTTAATT TAGAAGCAAC AAAAAGAGCT GTTTCTATGT CCATATCAAT 199860 

AATTATACTG CCCTCAACAG ACCCAGCAAG CCCTACTATT ACAGAAACAC CTTTTATCTT 199920 

TTGATTTATC GACTTAAGCC CGGGCTTACC CATTTCTATA TTCTCAACAA GCAACATATC 199980 

TCTTAAAACC GAAGAAGCAG CATCCAAAAA TGGCTCTATA TAATCTATTC TCATTAATTT 200040 

CTCCTTTAGA CTTTCCTGTA CAAGTTAAAA TATTTTGTGG ATTTCTCTTT TATAAAAACA 200100 

TCATTATTTT TAAGCTCTTC GTTATCTCCC AAAACCAAAA GAGCCCCTTT AATAGCTTTG 200160 

GAAGCAATGA TATTTAAAAT TAAAATCTGA TCTTTACTAT CCAAGAAACA TAAAACATCT 200220

TTTAAAAAAA CCATTCCCAA ATTATCAGGC AAATCTGAAA AAAGAGCATC GGAATATTCA 200280 

AACAAAACAT TACTTAGGAT TTCTGACTTA AATTTATAAA CTCCGGGACT TTGTTCAAAA 200340

GAATTCCTAC TATAAATTTC ACTAATGCCA ATTTCTGACT CTGAAAACAC CAACCTAGAA 200400 

GTTTCAACAA CTTTTGATAA ATCATTATCA ATGGCCGTCA ACTTAAAAGG TTTTACATAA 200460 

TATTCAGACA AAGCATTGGC TAAAGCCATA GTCTCCTTTC CACTACCACA ACCAATTTCC 200520 

AATACATTGA AAATAGAATT TAAGTTATTC ATAAAACTTA AGCGACTCTC AACAATTTCA 200580 

TTTTTAAATT CTTCCAAACA ATCAGCTCCC CACAAATTTC CCGATGATTT TGAATAAAAT 200640 

TCATTTAAAA AACTATCGCA AGGCAAGTAA TCAGTATCAA CCATATTAAA TTTTACTCCA 200700

ACTTTTTCCA AAAACACATC ATTTACCAAA GAAGCATTAA ACGAGTATTT CAAAAGATTT 200760 

TTTTTAATAT TTTCCAAATT AAAAGCTGCA GTATTGTTTA GAGTTGAATT TTCATTGTTA 200820 

CCATCATTTT TTGAAGAAAC CTTATTGGAA AAATCATTCT TAGAATTTTC TAGTATATCT 200880 

AAATTATCAC AATCGCTTAC AAAGTCGCTT TTTTCAACAA AATTTTGACC AGGTTTTAAT 200940 
AACTTTTCTT CTTCACCATA ^ffTAAAAATT TTAAAAACAT TAAGAAGTAT ^^^AAGCTTT 201000

TCGTTATAAT CTACAACGCC TTTTATATAG TTTATTAAAG AATCCTGAGA TAAAACTGGA 201060

TGTGGATCTT GAATAAGGCT AGAATCTATT GAAAAAACAT TATTAATTTT ATCAACAATT 201120

ACCCCTATAA GAAGGTCTTC GTTTTTTAAA ACCATAATAT CTTCAATATC TTTTTTATTA 201180

AATTCTAAAT TAAACATTAT TCTAAGATCT ATAATAGGAA TTATTTCACC CCGTAAATTA 201240

TCAAGCCCAG CAACATACTT TTTGGCATTT GGAACATAAG TAAAATTACT AGATTTTCTA 201300

ATTTCTTTAA CCTGCATAAT GTCTACTAAA TAATGATCCG ACCCAAGCTC AAAAGAAACA 201360

ACTTTAAAAT CAAAATTGGT CAATTTAGAA TTAGAATTTT TATCATCTAA AATTTTGGGT 201420

CCAAAATAAA TTTCTTTTAT CTGCACAAGA GATCACTCCT TAGTATCCTT TTGTAAATCA 201480

AAAAGTTTAA AAACATCAAT TATCAATACA ACCTTACCAT TGCCAAGCGT AGTAGCCCCA 201540

ACTATACCCG CGCTTGATGA AAATTTATCC TTAATAGGCT TTACTACAAA ATCTTCCTCA 201600

CCAAGAATAG AGTCTACAAC AATTGCTATC TTCATGTTGC TAGTATTAAC AACTATTAAA 201660

AATTTTTCTA TTAATGAATC ATCCCTTGTT ATGTTAAAAA GTTTATCAAG CCTGAGAACA 201720

GAAATGACTT CATCTCTTAA ATTATAAACT TCATGATAAT TTTCAAGCAA TTTTATATCA 201780

TGTTCAGTTA TTCTATGAGT TTCAAGAACA TTATTTAAAG GAATAACATA AGTCTCAGAC 201840

CCCGACTTTA CTAAAAGACC TTGTATAATC ACTAACGTCA ATGGTAGTTT AATTTTAAAA 201900

ATTGTTCCAA GACCAATTTC TGATTCCACC AAAATAGTTC CATTAAGCTT TTCAATGCTT 201960

TTTTTCACAA CGTCAAGACC AACTCCTCTA CCTGAAAGGT CTGTCACTTG AACTGCTGTT 202020

GAAAACCCAG GAGCAAAAAT TAAGTTAATA AGTTCAAAAT CAGAGTAAAT TGCATCTTCT 202080

TTTATTGTTC CCTTTTCAAT TAATTTGCGC CTAATGACCT TTGGATCTAT ACCAATCCCA 202140

TCATCTTCAA TCTCAATTGA TATTACATTA CCTTCATTCT TGGCACGCAA AATTATAGTA 202200

CCTGCTTTGC TCTTTCCCCT TTTAACTCTC TCTTCAACTG TTTCAAGGCC ATGATCCATT 202260

GAATTTCTAA CACAATGCAT CAAAGGATCT ACAAGGTCAT CTATAACAGA CTTATCAAGC 202320

TCAGTTTCTT CCCCTTCCAT TTTAAGATTC ACAATCTTAT TTAATTTCTT TGAAAGATCT 202380

CTTACGACTC TTGTAAACCT TGAAAATATA TTAGATATTG GTAACATTCT GGTTTTTAAA 202440

ACACTCTCAT GCAAATCTGT AATTATTCTT GACAGCCGCC CAGAGGTCAT TTTAAAATTT 202500

TGAAGAAGTC TGAAAAAAGA ATTTCTCAAT TCAGATATAT CCTTAAGAGC CTTTTCCATT 202560

TTAAAACTCA TCAAAGAATT AATATGTGAT TCGATCTCAT CTTCTAATGT TA&GCCTGCA 202620

TCTTTGAAAA CTATCTTTAA ATCAATTAAA AAGTTTCTCT GAAAACTTTC TTGATAATCA 202680

TAAAAATAAT TAAAATTATA AAATAATGTA ATCATTTCTG AATTTATTTG ATTATAAGAT 202740
GATTTACTTA TTACAGCCTC ACTGACAAGA TTTAATATGT AATCTATTTT TTTGCTATCT
ATTCTAATTA AATTAACACT AATTGGACTA TTTTTCTTAA TATTTTTATT TTCCTTAAAA 
GGTGCTTCAT CATCTTCTTT TAGCCTTACG CTCTTTAAAG ATTCTAAATT AACATTTTTG 
ATTTCAAAAT GACTAACAAC ATCTGGTAAA TTAATCTTTT TAGCAATACT TTCTTCACTG 
GTATTTGATA TTAAGTAATA TATTACAAAA TCAAAAAACT TATCTGCCAA TAATTCGCTA 
GAATCTGGGA TAGACTTGAA AATTTTACCA AGACTTTTTA ATGCTTGAAG CATTTGAAGC 
CCACTAATAG TAGCCATAGG ATTGTCTTTT ACAAAATCCA ATCTAACTTT AAATAACTTT 
TGATTTTCAA CGTCAAGTAA TAAATCAGAA ATCTCATCCT CTGTAAAATC AAAATTATCA 
TTTAAAACAA AATTGGAATC TAAATCAACA TCTTTAATCT CTTCATCTGC AAGCTTTTTT 
AATTCTTCTT TTACGTTAAA TTCATCAACC AAATAGCTTT CAATTAAATT TAAAGAATCT 
AAACTTTTTT TCACGCCTTC TATATCTGAA TATATCAGAT AATAATCAAC CCTTTTTAAA 
AATTTATCCT CTATGATTTG CTCATATTTA GGAATTGTAT GAAGTACAGA TCCTAAATTT 
TTTAAAATAT TAAATATTTT TAGCCCACTA TTTTCAACTT CAGAATTGCT ATTTGAATTA 
AAAACAACAC TGATCCTTAA AACCTTTTGT CCAATTCCTA GCCCTTCTCT TATCTCCTCA 
AGATCTGATT CTGAAAGACA AAAATTGTTT TTAATTGAAT TTCCATCAAA TCTCTTAATA 
AAAGTCTGAT CATCAATTAC TAAAAATTGC TTTAATTTGC TTTTAAGATC ACTTATGTCA 
TTTAAATAAA CCTTGCCATC AATACGAAGC GCAAGCATTT CCTTGATAAC ATCTAATGAA 
CTTAAAAGCA GATCAACAAG ATCATTATTT ATATTTACCT TACCATCTCT AATAGCATCA 
AAAACATCTT CGACAATATG GGTAAAATCA GATAACTCCA TCATATCAAG AGAAGCAGAG 
CTTCCTTTTA AAGTATGAGC TGCCCTGAAT ATTTCATCAA TAGTATCAGA ATTATTAGGA 
TCATCCTCTA ATGACATAAT ATTCTCTTCA AGGATATCTA CAAGATTTTG AGCTTCTTCA 
AAAAAAACTC CTAAAAGCTC TTCATTTTCC AAATCTAATA TTTCCATATA CTATTTCCTA 
TTATTTTAAT ATGTAAAGCA AACCTTTTAG GTAGCTTTAC ATATTAATTT AAACCTAATT 
TTTCGGAGAT GATTCTTCAG GTTTTTCACT CTCAATCTTT TCTACAAAGT TTTGGAAAGA 
GCCTTCAGGC ATAGAAATTT TTTCTCTAAG CTTTAAAACT CTTTTAAAAG TTTCGTGTGC 
CTTTAATTTA CGAAGGGATT CAGTTCCGCT AGTCTCATAA ACTTTAAATA CAGACTCACT 
GTCAATATCA GAATCTATTG AAACACTCAA CTTATCATAA AGAACTCTTA AATCTTTAAC 
ATAAAAGATG AAATTTTGCT CTTTTGAACT GTGTGACTTT GAAACTCTAA AAGCCTTAAA 
TCTCATTTTA CTTGAAGCAA GAGGATAATT TGGAACATCG TCTTTAATAA TTCTGGATGA 
TATATTAGGA ATATAGTTAG ^WtTGACCA AATTAAATCA GCCCACCCTT ^PRACTTTAA 204540
AGTACCCATA GAATAAGCAT ATTCCATGCC ATTCATATCT TCAAATAAAA CCTCAAGATC 204600
TATCTCATAC CCTAAACTAT AAACAGATAC CTTAATTTCT TTCATGGTTT TAATGTTATC 204660
AATAAGACCT TTGCCTAAAA ATTGATTGCC ACTTTCCCCT GAATAAAAAG GAATTTTAAA 204720
TGGTGGCATA ATCATAGCAG ATGATTGAGA ATAGCTTGGA AACAAAACTC TTACCCCTAA 204780
AATAGTATCA CGTGCGTACC TTTTTGACTC ACTCTTAACA ACAGCGGGCG CAACAACTGA 204840
ATTTTTAACG TAAGCCTGCA ACCTTGCAGA AGGAGTAAGT AAAACGCTCC AATTATTTAT 204900
CCCAAGATCT ACAACCATAT CTTCCGGCTT AACAATACCA GAAGCGCCCG AATATACATA 204960
ATCAACATAA TTTGTAAGAT CAAGTCTAGT TGAACTTGGA TCTCTTGCAA GCTCGGCAAA 205020
ATCTAAAACT AATTCTCCAG GCTCTGCCCT TTTAGAACCC TCTGCTAATC CATCAGTCTC 205080
TTGAGGAAAA AGAACAGTGG ATAATAAAAA AAATAAAATA CTTTTAGCTT TCCTTTTCAT 205140
GTAAACCAAC TCCTTATATA TATAAATCTT TTATAAAATT TTCGTTTTTT AATTATTTTT 205200
ATTAGTAAAA TTTAATTTAT CAACTTTGCT CCTTTAAGAT ACACCCTTCA ACAAGAATAA 205260
CAATCTTTTT TATCTTACCA GGATAGTTAA AGCGCATTGT TTTTATTAAA GACATAGCAA 205320
AAAGATTGAC CTTAACATCA AAAGATTCAT TAGATTCATT ATAATCACCA TTATTAAAAG 205380
AATCATAAAA TTCTCTTGAA AGATTTACAT AATAAACTCC ATTCTTTAAA AAAGAATATA 205440
AAAATCTTGA ATCACTTAAT AAAAACCCAA AAGAAAACCC TTCATTGCTT CCTAAAAGAA 205500
AATCTTTTAC TAAAAGATCT AAATTATCTT TCAAATTTTG TTCATCTCTT AAATATCTTA 205560
AATTAGCAAC AAATCCCTTG CTAGAATGAA AATAAAAAAC CTTTTTTGAA AAAAGATTAT 205620
CATAATTTAA AAAAACCATA CAAATAGAAA ACAAAAAGCT TACTGCCAAA GACCCCAAAA 205680
GCACCCTAAT CATATGCTCT TTTTTTGAAT TTAAAAATTT ATAAATCACT ACATTATATT 205740
TAAAAAATAT ATCCGAAATA TTATTTTTCA TAAAAATTTA TAAATTCCAT TAAAGCTTTA 205800
AGTATTAGAA TATTAAACTT GCTCATGTAA TTATAATCCA AAATTAATTT AGCATCTAAA 205860
ATATTGGATA AAAACCCCAT TTCAATCAAC ACAGCAGGCA TACTGCTGTT TTTTATTACA 205920
AACCATTGCT CTTTTCTGAT TGGCCTAATA TTAGTTTCGC TTAACTCATT TTTAAACACT 205980
TTATACAAAA TTTCAGCCAA TCTTTTTGAT TCATATTTAT ATTTAATATC TAGTATATCA 206040
TTAAGCTCGC TTAAGTATCT ATTACCTTTA ATATCATATC CCTTAAAATC TTTAATAACT 206100
TCTCTTTTTG AATCCTTAGG AAGATACCAA AACTCAACTC CTCTAGCTTC ACCGTTTGGA 206160
GCATCATTAG CATGTATAGA TAAAAATATA ACATTATTGG GGAAATTTGG CTTTATTGCA 206220
TTTGCAAATT CCGACCGTTC TTTTAAAGTT AAATAAACAT CATTTATACG AGTTAACAAA 206280
ATATTTTTAT TTACAAAATA ATTACTTAAA ATTTTAGACA AATATATAGA ATAGGTTAAT
GCAAAATCTT TTTCCTGAAG CACAACGTCA TAACCATTTA TCTTTAAAGT CACAACAGCA 
CCAGTATCAT GCCCGCCATG TCCAGGATCA ATGATTATTG AAGTAATTCT GGGTTTATTA 
TAGTCTTTAA GAGAACTAAA ATAATTTTCA ATTTGTTTTA ATACCTTTTG ACTTATTAAA 
ATTTCTCCAC GAATGTCAAT AATTGGGTCT ACAAACATAT AATAACCAGA AGATGTAAGC 
GCATATTCAA AGCCTACCCT AAACTTCAAA TATCCCTTAT CATTTTCAAT TGTAAAAACA 
TCATTTTCAA TGTTAAAATC AAACCTAAAA ACATTAGTAT CAAAAAAATC AAGAACATTT 
AAATAATCGG GGGTCTTAGA ATACAAGCTT AAATATGAAA ACAAAATCAA ATCAATCAAT 
AATATCATTT TCCCAAAGCT CAATGGCACT CTTTAAGTCT CCTTTGAATC TTAAGCTATT 
TTTAGTAATC TCACCTTTTA TCTCCTTAAA CTCTTTTTGG AAGAGAAAAG GATTAATTTT 
GTATTTTAAA CAAATTTTGC AAAACCTTAA AAAAGAAAAA ACAACGCCAT TTCCAACAAA 
AATTGGGACA TAATTCTTAA GCAAAAAAAT CAAAAAACTT TTGCTTTTGG TTAAATTTTC 
TGGCGAATTT AATGCAGTAT ACTGGATTCT CAGCTTTCTA TAAATATAAA CATGCTCCCC 
AAAATAAATA GCTCTTAGTC CAAAATCTAT TCTTTGAAAA TATTCATTTT GAATTCTTGC 
GTCAAAACCT CCAAGCTGTA AAAATTTCTC TTTAGAATAA AGTCCGCAAT AATCCATGGT 
AATCAAAGTT TTTTCATAGT CTTTCTCAGA ATTTACTAAA ATTACCTTAA ACTTTTGCTT 
TTTATCTATG CTGGGAAGAA AAATTGAAGG AATCATTTCC TCTTCTTTAT CAAAAAACTC 
CCCACCAACA AGAAGAACAT TTTTTTTTAC TATTTCATCG AATATATTTG GAATCCAAAA 
GGGATTTAAC AAGTACATAT CACTTTGCAA AACAAAAACA AAATCACAGC TGGATTCTTT 
CATTGCTAAA TTAACCTTTT CCCCAGAATT CAAATCGTCA GAAAGTAAAA TAAATTTTAA 
CTTACCATAA CTTTCTGAAA TAAACTGCAA AGAACTTCTA TTGCTCTGTT TTTCAATTGA 
AATTATTTCT CTTATAAAGT CAAAATTTGA TAAAAATTCA AACAAATCTT CTCTAAAAAT 
TTTTGTTCCT CTGCTTAATA TTACAAAAGA AATTCCAAAA GAAGATTTTT GTGAATAATT
ATTTTTAGAT TGAATAACAG TATATGAATA GCCACTACCT GGAAGACGCA TAAATTACTT 
TTAAAATCCT TATAATTAAA TTATAATAAT CATATGTTAC ATAATACAAT GCTAATTGCA 
AGAATAATGA ATATTAATAC ATTATTCTAC GGCATGATCA TTATCATTTT TGCACTCATT 
TCTTGCAATC ATAAGAATAT ACAGTACGAC AAGAGAATTA AAAAATTTTT AGATAAAAAC 
AAAATTGAAT ATAAAATAGA CTCAGAAAAT GACTTTATAG CATTTAAAGA TATAAACAAT 
AACGAAAAAG AAGAAGTAAT CATCAGATCA AGACTAAACT CATATAAAAA TTCAAAGATA 
AGAGAAATAT TTGGAATTGT ^SkAGTATTT GATATAAACA CACCAAAAAtf^^AAGAAATA 208080
TCTGACTCGC TTATGAGCGA TAGTTATAAT AACAGAGTAT TTGGATCGTG GGAGATTATT 208140
CATAATGCAG AAAGAGGAAT CAACTCTTTG GTATATATTG TAAAAGCAGA AGAATTTGCA 208200
AATGATACAT TTTTGCTTGA TGCAATTGAT GAGATTGCCT CAACAATAAG TATTTTCAAA 208260
AAAATAATAA CAACCAACAA CGAAAACATT GATAATAATG AAGAAAATAA CAATACAAAT 208320
GAATCAAATG AACAGCCCAC CTTAAAGCAA GAAAAAACAA ATTCAACAAA AGAATCTAAT 208380
AACGAACTTA AAGAAGATCA AATAGAAGAA GAACTTCAAG AAATCAAAGC CCAATAATTT 208440
CAAAATCATT CTACTAATAA AGAATTAACA TCAAAGCAAA AATGAACCTT GTCACCTATT 208500
TTTATTTGAA ATTTAGAAAA TGAGCCCCTT GGAAGCTCAA GAGCATACCT TACCTTATAA 208560
AGAGAATTAA CATTTGCCCT AGAGTACGGC TCTAAATCAT AAATCTCTTT AATAATTCCA 208620
CTTGAATCAA TATAAGCTAT TTCAAGCAGC AAAGGTGTAT TTTCCATCCA AAAAGACAAA 208680
TTTTGATCTT TTTTAAAAAC AAAAAGCATT CCATTGCCAT ATTCAACTTT TTGAGCACCG 208740
ATGTAACCTT TTGCCCTATC AAGCTCATTA GATGCTATTT TTACAAAAAA CTTAACCCCA 208800
TTTATCATAA TTTCTTTATC GTAAAAATGA TCAGCAAAAG ATAAAAAAGA CATCGACAAA 208860
ACTAAAAACA AAAACCGTTT TAAAATTTTT TTCAATTATC AGCCTTATTA AAAATCATTT 208920
ATTATAATTT GAAATATAAG ATTTTAAAGT AATTCTTAAA ATATTTTTAT TTAAAACAAT 208980
AATAGAATCA CCAAGATCCC ATAAATAATT CAAAGGCGTT ACTAGTGTAG CCTCGCCAAA 209040
CTTTTTCTTT ACCTCAATAA AAATTGCATA CTGAGATATT AATTTTTTAT CAAAAACAAA 209100
AGTGTACGAA AACATTTTAT TTTTGTCAAA ATTAAAAACA GCATATTTTA AATGGGTTTT 209160
AGCAATAAAT TGTAAAGTCT GCTTCCTAGC ATAAGGAAAT TCAACTTCTT CAACATCATA 209220
ATAAAAAATA GTGCTCTTGA GAACATTCTC TTTAACAGAA TCCATGCTAG AACCCAGAAC 209280
AAAAGATGCA AAAATAGAAA ATAAATAAAA AACTGTAATA CCCATACAAA GCCTTTAGCC 209340
ATAAACAATA TAAAGAAAAA GTATAACAAA TAAAAAACTT GAAAATAAAT AAAGCAAAAT 209400
ATCAATATTA ATCCCAAAAT GGAAAAGCCT TACAAAGGTC TTAGAAGCTG CTCTATCACT 209460
TACCATTCTT GCTCTTTCTT TTCTTTTAAA GGGCTTTTCT TTAAGTTCTT GAAAAGGACT 209520
GTAGTGACAA TTAGGACAAC CTTCTCCAAA AGCAGAAACT GGACCTACAT GCCTACAATT 209580
TGGACATTCG ATATCACCAA GCTTAGCAGC ACAGTTGGGA CAAACAGATC GATTAAGTCC 209640
AACTTTTTCG £CACATTGCT CGCAAAAAAC TTCAAAATTT ACCTTTGCCA AACGAAAATT 209700
CCTCAAATTT TACTTAAAAA TAAGTATTTA AAATACTATA TAATTAATTA TAATAAAAAA 209760
TATATGAATA TTACATATTT AAGGAATACT AAAACATGAA ATCGGGATTT GCAGCAATAC 209820
TTGGTAGACC ATCAACTGGA AAATCTACCC TTTTAAATTC AATATGCGGA CATAAAATAT 209880
CAATAATATC CCCTATTCCG CAAACAACTA GAAATAATAT AAAAGGAATC TTTACGGACG 209940
ACAGAGGACA AATTATTTTT ATAGACACAC CGGGATTTCA TCTGAGTAAA AAAAAGTTTA 210000
ATATTGCAAT GATGAAAAAT ATCCACTCTT CAATAGGAGA AGTTGAAGTC ATTTTATACA 210060
TAATAGACAT TCAAGACAAA CCTGGAGAAG AAGAAAATAA AATGTTAGAA ATAATTAAAA 210120
ACTCTAAAAT TAAATTTTTA GTAATACTTA ATAAAATTGA CCTTAAAAAC ACAAAAATAA 210180
AAGAAATAAC GCAATTTCTA AAAGAAAAAG GAATAGAAGA TAGTAATATA ATTAAAATAT 210240
CTGCTGAAAA AAAAATTAAC ACAGAAGAAC TAAAAAATAA AATTTATGAA AATTTTTCAG 210300
AAGGCCCACT TTATTATCCA CAAGAATACT ACACCGATCA AGAAATAAAT TTTAGAATTA 210360
GTGAAATAAT AAGGGAAAAA GCTATTGAAA ACCTAAAAGA AGAACTCCCC TATTCTTTGT 210420
ATGTGGATAT TGATACCTTA GAAAATAAAA AAGGAAGTCT TTTTATCAGA GCAAATATTT 210480
TTGTAGCCAA TGAAAGTCAA AAAGGAATAA TTGTAGGAAA AAACGGAAAA GAAATAAAAT 210540
CAATAGGAGA AAGGGCAAGA AAAACAATTG CAAAAATTTT TGAAACAAAA TGCAACCTAT 210600
TCTTACAGGT AAAACTTAAA AAAAATTGGA ACAAAGAAGA TAAGCTAATA AAAAGACTTA 210660
TAAATTAACA AACATTAAAC TGCATTTTTT TAAATTCTTG AAACTTGAAA AACAAAATGC 210720
TAAAATTTAC CTAAATTTAA ATTAGGAATA AAATGTGAAA ACAGCACACT GGGCAGATTT 210780
TTACGCAGAA AAAATAAAAA AAGAAAAAGG TCCAAAAAAC TTATACACAG TAGCATCGGG 210840
AATTACTCCA TCTGGAACTG TGCACATTGG CAATTTTAGA GAAGTTATTT CGGTAGACCT 210900
TGTAGCAAGA GCACTAAGAG ACTCTGGATC AAAAGTAAGG TTTATTTATT CTTGGGATAA 210960
TTACGACGTA TTTCGAAAAG TTCCCAAAAA TATGCCAGAA CAAGAACTTC TTACAACTTA 211020
TTTAAGACAA GCAATAACAA GGGTCCCTGA CACAAGAAGC CACAAAACAA GTTATGCAAG 211080
GGCTAATGAA ATTGAATTTG AAAAATATCT GCCTGTAGTT GGGATCAATC CTGAATTCAT 211140
CGACCAAAGC AAACAATATA CCAGCAACGC TTATGCAAGC CAAATAAAAT TTGCACTTGA 211200
TCATAAAAAA GAACTGTCTG AAGCATTAAA CGAATACAGA ACCTCAAAGC TTGAAGAAAA 211260
TTGGTATCCA ATCAGTGTAT TTTGTACAAA ATGCAATAGA GACACAACAA CTGTAAATAA 211320
TTATGACAAT CATTACTCTG TTGAGTATTC ATGTGAATGT GGAAATCAAG AATCTCTAGA 211380
CATAAGAACC ACATGGGCCA TTAAACTTCC TTGGAGAATA GATTGGCCTA TGAGATGGAA 211440
ATATGAAAAA GTTGACTTTG AGCCTGCAGG AAAAGACCAC CACAGCAGTG GCGGCAGTTT 211500
TGATACATCT AAAAATATTG TAAAAATTTT TCAAGGTAGC CCTCCTGTAA CATTTCAATA 211560




